Spectro-temporal processing of speech – An information-theoretic framework

Thomas U. Christiansen 1, Torsten Dau 1, and Steven Greenberg 1,2

1 Center for Applied Hearing Research, Ørsted•DTU, Acoustic Technology, Technical University of Denmark, Ørsteds Plads, bldg. 352, DK-2800 Kgs. Lyngby, Denmark
{tuc, tda}@oersted.dtu.dk
2 Silicon Speech, 46 Oxford Drive, Santa Venetia, CA 94903, USA
steveng@savant-garde.net

1 Introduction

Which acoustic cues are important for understanding spoken language? Traditionally, the speech signal has been described mainly in spectral terms (i.e., the distribution of energy across the acoustic frequency axis), while its temporal properties are often ignored. However, there is mounting evidence that low-frequency energy modulations play a crucial role, particularly those below 16 Hz (e.g., Christiansen and Greenberg 2005; Drullman, Festen and Plomp 1994; Greenberg and Arai 2004; Houtgast and Steeneken 1985). Modulations higher than 16 Hz may also contribute under certain conditions (Apoux and Bacon 2004; Christiansen and Greenberg 2005; Greenberg and Arai 2004; Silipo, Greenberg and Arai 1999). What is currently lacking is a detailed understanding of how amplitude-modulation cues are combined across the acoustic frequency spectrum, and of how spectral and temporal information interact. Such knowledge could enhance our understanding of how spoken language is processed in noisy and reverberant environments by both normal-hearing and hearing-impaired listeners.

2 Experimental Methods

The current study investigates the spectro-temporal cues associated with the identification of Danish consonants through systematic filtering of the modulation spectrum in different regions of the audio frequency spectrum. Because of speech's inherent redundancy, much of the signal's audio frequency content must be
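The kind of modulation-spectrum filtering referred to above can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the authors' actual processing chain: it takes one audio-frequency band of a speech signal, extracts its Hilbert envelope, low-pass filters that envelope below a modulation-frequency cutoff (here 16 Hz, the boundary emphasized in the Introduction), and reimposes the smoothed envelope on the band's fine structure. The function name `lowpass_envelope` and all parameter choices (filter order, cutoff) are hypothetical.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def lowpass_envelope(band_signal, fs, cutoff_hz=16.0, order=4):
    """Low-pass filter the amplitude envelope of one audio band.

    band_signal : speech already band-pass filtered into a single
                  audio-frequency region (1-D array of samples)
    fs          : sampling rate in Hz
    cutoff_hz   : modulation-frequency cutoff (e.g. 16 Hz)
    """
    analytic = hilbert(band_signal)
    envelope = np.abs(analytic)            # Hilbert envelope (slow modulations)
    carrier = np.cos(np.angle(analytic))   # temporal fine structure
    # Zero-phase low-pass filter on the envelope only
    sos = butter(order, cutoff_hz, btype="low", fs=fs, output="sos")
    smoothed = np.maximum(sosfiltfilt(sos, envelope), 0.0)
    return smoothed * carrier              # band with only sub-cutoff modulations

# Example: apply to 1 s of noise standing in for a filtered speech band
fs = 16000
x = np.random.randn(fs)
y = lowpass_envelope(x, fs)
```

In a full condition of the kind described, this step would be applied independently to each audio-frequency band (each with its own modulation cutoff) before the bands are summed back together for presentation to listeners.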