Effect of sound localisation on melody segregation

Marie Camilleri 1,2, Jeremy Marozeau 1, Hamish Innes-Brown 1, and Peter Blamey 1,3

1 The Bionic Ear Institute, East Melbourne, Australia
2 The University of Montpellier, France
3 The University of Melbourne, Australia

Abstract

The study presented here tests the effect of localisation cues on the auditory streaming of musical sounds. A psychoacoustic experiment was run with 7 normal-hearing listeners to test the segregation of a 4-note repeating melody from pseudo-random interleaved distracter notes. During the experiment, the location of the distracter notes was moved across 7 loudspeakers while the melody location was fixed. The results show that listeners needed a minimum separation of 30° azimuth to segregate the melody from the distracter when the melody was presented in front of them, and a minimum of 60° when the melody was presented on the left or right side.

Key-words: sound segregation, localisation, music perception, auditory grouping, binaural hearing.

1. INTRODUCTION

It is known that hearing-impaired listeners have difficulty understanding speech in noisy environments and appreciating music. These phenomena may be explained by erroneous auditory scene analysis caused by incomplete information from hearing aids and cochlear implants [1-3]. Auditory scene analysis has been studied by many authors, including van Noorden [4], Deutsch [5], Hartmann [6] and Bregman [7]. These authors have shown that when the frequency separation between two rapidly alternating tones is sufficiently large, the perceived pattern breaks up into two melodies. Attention and training also play a role in the integration/segregation process [8, 9].

Music can be defined as organised sounds with musical intent that can be decomposed into two orthogonal dimensions [7]. The horizontal dimension refers to the temporal organisation of sounds, which defines the melody. The vertical dimension refers to the relationship between simultaneous sounds, which defines the harmony. The main perceptual cues characterising musical sound are pitch, timbre, rhythm, and loudness. Each of these cues has one or more corresponding physical parameters in the acoustic signal: fundamental frequency, spectral and temporal profile, and intensity. The fundamental frequency effectively transmits the pitch of a sound, while timbre and melody are encoded by the spectral and temporal profile of the music. The temporal envelope, especially the onsets and offsets of sounds, is also essential for perceiving rhythm [2, 10].

Although stream segregation is possible in monaural hearing [5], auditory scene analysis is more effective in binaural hearing (see reviews by Blauert [11], Middlebrooks [12], Darwin [13] and Akeroyd [14]). This binaural advantage arises from the combined effects of redundancy, head shadow and binaural squelch (the ability to extract a signal from a noise presented binaurally). Binaural hearing allows the comparison of the acoustic signals at each ear, in particular the analysis of interaural time differences (ITDs) and interaural level differences (ILDs), which vary with sound source azimuth. The complementary roles of ITD and ILD across frequency are described by the duplex theory: ITD is the more effective cue at low frequencies, while, owing to the head-shadow effect, ILD is the most useful cue for determining the direction of high-frequency sounds. Recently, Stainsby [15] studied the influence of ITD manipulation on streaming and observed that the differences in spatial location produced by ITD have only a weak effect on auditory streaming.
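To give a sense of the ITD magnitudes that the azimuth separations in this study correspond to, Woodworth's classical spherical-head approximation (a standard textbook model, not used in the paper itself) relates ITD to source azimuth θ (in radians, measured from straight ahead), head radius a and speed of sound c:

\[ \mathrm{ITD}(\theta) \approx \frac{a}{c}\,(\theta + \sin\theta) \]

Taking the conventional values a ≈ 0.0875 m and c ≈ 343 m/s, a 30° azimuth corresponds to an ITD of roughly (0.0875/343)(0.524 + 0.5) ≈ 260 µs, and 60° to roughly 490 µs.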
Few studies have examined the effect of localisation cues on musical streaming. Saupe [16] tested 20 normal-hearing listeners in a polyphonic music environment. The task was to detect targets and ignore distracters at different locations. Results showed that a spatial separation of 28° between targets and distracters was sufficient for a significant improvement in target detection. The influence of fundamental frequency, visual cues, and musical training has been examined in previous studies [8, 17]. This study proposes to evaluate the effect of localisation cues on musical stream segregation, and to determine the minimum angle necessary to begin to segregate a simple melody from interleaved distracter notes, for listeners with normal and impaired hearing. These results are important for the development of the next generation of hearing devices, which may use localisation cues to improve music appreciation. Only the data from the normal-hearing listeners will be discussed in this paper.

2. METHOD

2.1 PARTICIPANTS

Seven normal-hearing adults (4 females and 3 males) were tested. The mean age was 37 years (SD 17.5, range 20-59 years). Hearing thresholds were below 20 dB HL between 250 and 8000 Hz for each participant. In addition, localisation measures with sounds presented from one loudspeaker at a time showed a good balance between the two ears and good localisation ability for every loudspeaker, for each participant.

2.2 STIMULI

The melody notes were constructed using Matlab 7.5 and presented using MAX/MSP 5 through an M-AUDIO FireWire 48-kHz, 24-bit sound card. Each note consisted of a 180 ms complex tone with 10 harmonics. Each harmonic was
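For illustration only, the following is a minimal Matlab sketch (not the authors' code) of a note like the one described above: a 180 ms complex tone with 10 harmonics at a 48-kHz sampling rate. The fundamental frequency, the 10-ms onset/offset ramps, and the equal harmonic amplitudes are assumptions for this sketch; the paper's actual harmonic weighting is cut off at the page break above.

% Minimal sketch (assumed parameters): 180-ms complex tone, 10 harmonics
fs  = 48000;                 % sampling rate (Hz), matching the sound card
dur = 0.180;                 % note duration (s)
f0  = 440;                   % fundamental frequency (Hz), illustrative only
t   = 0 : 1/fs : dur - 1/fs; % time vector
tone = zeros(size(t));
for h = 1:10
    tone = tone + sin(2*pi*h*f0*t);        % add the h-th harmonic
end
tone = tone / max(abs(tone));              % normalise to avoid clipping
n = round(0.010*fs);                       % 10-ms raised-cosine ramps (assumed)
w = 0.5*(1 - cos(pi*(0:n-1)/n));           % ramp rising from 0 to ~1
tone(1:n)         = tone(1:n) .* w;        % fade in
tone(end-n+1:end) = tone(end-n+1:end) .* fliplr(w);  % fade out
sound(tone, fs);                           % audition the note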