Using the Speech Transmission Index for predicting non-native speech intelligibility

Sander J. van Wijngaarden,a) Adelbert W. Bronkhorst, Tammo Houtgast, and Herman J. M. Steeneken
TNO Human Factors, PO Box 23, 3769 ZG Soesterberg, The Netherlands

(Received 5 March 2003; revised 10 February 2003; accepted 15 December 2003)

While the Speech Transmission Index (STI) is widely applied for prediction of speech intelligibility in room acoustics and telecommunication engineering, it is unclear how to interpret STI values when non-native talkers or listeners are involved. Based on subjectively measured psychometric functions for sentence intelligibility in noise, for populations of native and non-native communicators, a correction function for the interpretation of the STI is derived. This function is applied to determine the appropriate STI ranges with qualification labels ("bad"–"excellent") for specific populations of non-natives. The correction function is derived by relating the non-native psychometric function to the native psychometric function by a single parameter. For listeners, the parameter is found to be highly correlated with linguistic entropy. It is shown that the proposed correction function is also valid for conditions featuring bandwidth limiting and reverberation. © 2004 Acoustical Society of America. [DOI: 10.1121/1.1647145]

PACS numbers: 43.70.Kv, 43.71.Hw, 43.71.Gv [KWG] Pages: 1281–1291

I. INTRODUCTION

The intelligibility of speech is generally considered to depend on the characteristics of the talker and the listener, the complexity of the spoken messages, and the characteristics of the communication channel. Objective speech intelligibility prediction models have been shown to accurately predict the influence of the communication channel characteristics on speech intelligibility.
Examples of such models are the Articulation Index (AI) model (French and Steinberg, 1947; Kryter, 1962) and more advanced models based on the AI, such as the Speech Intelligibility Index (SII; ANSI, 1997) and the Speech Transmission Index (STI; IEC, 1998; Steeneken and Houtgast, 1980; Steeneken and Houtgast, 1999).

In some cases, the overall speech intelligibility that is experienced is clearly affected by factors other than the physical characteristics of the channel. Individual talker differences (Bradlow et al., 1996; Hood and Poole, 1980) and message complexity (Pollack, 1964) were already mentioned. Other examples are individual differences in speaking style (Picheny et al., 1985) and hearing loss (Plomp, 1978).

An important determining factor for speech intelligibility is language proficiency, of talkers (van Wijngaarden et al., 2002a) as well as listeners (van Wijngaarden et al., 2002b). Learning a language at a later age results in a certain degree of limitation to language proficiency (Flege, 1995). So-called non-native speech communication is practically always less effective than native communication. The intelligibility effects of non-native speech production and non-native perception interact with speech transmission quality (the quality of the channel). Speech-degrading influences such as noise (Buus et al., 1986; Florentine et al., 1984; Florentine, 1985) and reverberation (Nábelek and Donahue, 1984) aggravate the intelligibility effects of non-native speech communication.

For various applications, it would be very useful to have an objective, quantitative intelligibility prediction method that is capable of dealing with non-native speech. In Sec. II of this article, the suitability of existing objective speech intelligibility prediction models for non-native applications is discussed. Section III continues by proposing a way in which the Speech Transmission Index (STI) can be used in various non-native scenarios.
Section IV contains a validation of this approach for speech in noise, bandwidth limiting, and reverberation.

II. SUITABILITY OF OBJECTIVE INTELLIGIBILITY PREDICTION MODELS FOR NON-NATIVE SPEECH

A. Speech transmission quality versus speech intelligibility

Speech intelligibility can be thought of as the success that a source and a receiver (talker and listener) have in transmitting information over a channel. Each unique talker–listener pair has a certain potential for transmitting messages of a given complexity. The quality of the transmission channel determines how much of this potential is realized. A typical transmission channel could be a phone line, a public address system, or the acoustic environment of a specific room.

Objective prediction models are especially good at quantifying speech transmission quality. The influence of factors determining speech intelligibility that are related to talkers and listeners, rather than the channel, has been incorporated to a lesser degree. A proficiency factor has been proposed (Pavlovic and Studebaker, 1984) for incorporating talker- and listener-specific factors into the framework of the articulation index, but this has not been developed to a level where practically useful predictions can be obtained.

a) Electronic mail: vanwijngaarden@tm.tno.nl

J. Acoust. Soc. Am. 115 (3), March 2004 0001-4966/2004/115(3)/1281/11/$20.00 © 2004 Acoustical Society of America