Lexical Tone Perception in Musicians and Non-musicians

Jennifer A. Alexander¹, Patrick C.M. Wong²,³, and Ann R. Bradlow¹

¹ Department of Linguistics
² Department of Communication Sciences and Disorders
³ Northwestern University Institute for Neuroscience
Northwestern University, Evanston, IL, U.S.A.
jenalex@northwestern.edu

Abstract

It has been suggested that music and speech maintain entirely dissociable mental processing systems. The current study, however, provides evidence that the processing of certain shared aspects of the two overlaps. This study focuses on fundamental frequency (pitch), which is an essential component of melodic units in music and of lexical and/or intonational units in speech. We hypothesize that extensive experience with the processing of musical pitch can transfer to the lexical pitch-processing domain. To that end, we asked nine English-speaking musicians and nine English-speaking non-musicians to identify and discriminate the four lexical tones of Mandarin Chinese. The two groups performed significantly differently on both tasks: the musicians identified the tones with 89% accuracy and discriminated them with 87% accuracy, while the non-musicians identified them with only 69% accuracy and discriminated them with 71% accuracy. These results provide counter-evidence to the theory of dissociation between music and speech processing.

1. Introduction

Music and speech have much in common. For one thing, they represent the two most cognitively complex uses of sound-based communication found in any of the world’s species. In addition, both speech and music are generative: a finite number of simple elements, such as pitches or segments, combine hierarchically (“syntactically”) to create increasingly complex, meaningful structures, such as words and utterances, melodies and songs [1].

In view of these functional similarities, a considerable body of previous work has investigated the extent to which music and speech share common processing mechanisms. One possibility, which assumes a strict interpretation of modularity, is that music and speech are cognitively unique and distinct in that they maintain discrete mental processing systems [2]. A number of behavioral and imaging studies produced within this framework have given rise to the idea of hemispheric lateralization or dominance, which holds that linguistic processing takes place in the left hemisphere of the brain, while music processing occurs in the right hemisphere [3-4].

An alternative possibility is that hemispheric dominance pertains to particular aspects of auditory processing, and that shared acoustic features of speech and music are processed similarly. In support of this alternative, several studies have shown that the left hemisphere tends to handle phonemic processing – the processing of words, syllables, and lexical tones – while the right processes melodic and prosodic units, such as musical phrases, intonational phrases, pitch contours, and affect (see, e.g., [5-14]). Moreover, the picture of hemispheric dominance for separate aspects of processing is not as absolute as it may seem. As Wang et al. [14] note, the lateralization of the brain is but a tendency; “dominance” does not always exclude activity in the other hemisphere. Recent behavioral and neural studies have shown that the lateralization boundaries can in fact be blurred.
Certain shared aspects of music and speech, such as hierarchical (“syntactic”) organization, appear to be processed in overlapping areas, suggesting that common neural mechanisms subserve speech and music [15]. If this is the case, then it would not be surprising to see behavioral manifestations of sharing between music and speech.

The current study seeks to identify another such similarity in music and speech processing by directly investigating the effect of experience-dependent learning in one domain (in this case, music) on processing in the other domain (in this case, speech). Specifically, we hypothesize that there is an overlap in the processing of fundamental frequency (pitch) in these two domains, such that extensive experience with pitch processing in music will be manifested as “enhanced” pitch processing in speech. Systematic pitch variation is a fundamental feature of both music and speech: music incorporates pitch changes within a specified tonal framework (the “key” of the piece) to express compositional and affective meaning, while in speech, pitch is used to convey pragmatic information and, in the case of tone languages, lexical information; that is, contrasts in word meaning can be conveyed by pitch pattern alone (in Mandarin, for example, the syllable ma means “mother” with a high-level tone but “horse” with a low-dipping tone). In this study, American English-speaking musicians and non-musicians with no tone-language experience were asked to identify and discriminate the four lexical tones of Mandarin Chinese (high-level, high-rising, low-dipping, and low-falling). Results suggest that extensive experience with musical pitch processing may facilitate lexical pitch processing in a novel tone language to a significant degree.

2. Methods

2.1. Subjects

Three groups of subjects participated in this study. The first comprised five adult female native speakers of Mandarin Chinese who also speak English (“Mandarin speakers”). All subjects in this group ranked Mandarin as their dominant language relative to English and spoke mostly Mandarin in childhood. The Mandarin speakers ranged from 24 to 36 years of age, with a mean age of 29.8 years (SD = 5.4 years). The second group consisted of seven adult female and two adult male native speakers of