Supplement to the Proceedings of the 36th Boston University Conference on Language Development

The effect of variation on phonetic category learning

Madelaine Krehm¹, Adam Buchwald, and Athena Vouloumanos
New York University

Introduction

To understand speech, we must organize its natural variation into meaningful perceptual units that are appropriate to our language. Our perceptual systems divide the continuously varying speech signal into language-specific phonetic categories. For example, one linguistic parameter that varies continuously within and between languages is voicing, the vibration of the vocal cords. The timing of voicing onset relative to the release of the consonant is referred to as voice onset time (VOT). Different languages divide the voicing continuum differently, both in terms of the number of phonetic categories they specify within a continuum, and in terms of the specific VOTs that mark phoneme boundaries. For example, English has two alveolar stops, /d/ and /t/, with mean voice onset times of 5 ms and 70 ms, respectively. In contrast, Thai divides alveolar stops into three categories: /d/ (with mean VOT of -78 ms), /t/ (9 ms), and /tʰ/ (65 ms; Lisker & Abramson, 1964).

We perceive a continuous feature like VOT as specifying distinct, language-specific categories through categorical perception: Given the same magnitude of acoustic change, adult native speakers perceive a greater difference across the boundary between phonetic categories than within a phonetic category (e.g., Goldstone & Hendrickson, 2010; Liberman, Harris, Hoffman, & Griffith, 1957). Infants and adults learning English or Thai are faced with the task of learning how many alveolar stop categories their language has (two vs. three) and the specific VOT boundaries that distinguish these categories. In this paper we examine how adults learn the relevant phonetic categories for different languages.

Learners face yet another challenge when learning phonetic categories.
Although the speech signal varies on dozens of different dimensions, only some, such as voicing, are crucial to understanding a given language, while others are irrelevant to that language. For example, a change in VOT could change “ten” into “den”, altering the meaning of the word, but a change in pitch does not affect word meaning in English. While pitch contours are irrelevant to word meaning in English, they are important for distinguishing tokens with different meanings in other languages. In Mandarin, /ma/ with a flat pitch contour means “mother”, but /ma/ with a falling pitch contour means “scold”. Since languages differ in which dimensions of variation are linguistically relevant, listeners must learn which dimensions matter for the specific language they are learning.

¹ Reprint requests should be addressed to Madelaine Krehm, Department of Psychology, New York University, 6 Washington Place, New York, NY, 10003, USA, e-mail: madelaine.krehm@nyu.edu.

One commonly proposed model for general category learning, the exemplar model, can be applied to the acquisition of phonetic categories in speech (Pierrehumbert,