MEASURING RESONANCES OF THE VOCAL TRACT USING FREQUENCY SWEEPS AT THE LIPS

F. Ahmadi and I. V. McLoughlin

School of Computer Engineering
Nanyang Technological University
Nanyang Avenue, Singapore 639798

ABSTRACT

Precise measurement of the resonances of the human vocal tract is important in acoustic phonetics research. It also has significant applications in speech therapy and language learning, where it provides feedback about the shape of the vocal tract and the position of the tongue. This paper investigates a novel method of measuring these resonances using linear frequency sweeps at the lips. To investigate the effectiveness of the method, tests have been completed on constructed tube models of the vocal tract and on human subjects for six English vowels. The precision of the measurement in the current implementation is shown to be superior to that of the traditional electro-larynx method.

Index Terms: Speech, vocal tract, Kelly-Lochbaum

1. INTRODUCTION

Formant frequencies are the peaks of the spectrum of the vocal tract. When a voiced phoneme is generated, this spectrum is sampled at the harmonics of the pitch frequency. If such a phoneme is used to estimate the formants, the precision of the estimate cannot be much better than the harmonic spacing of the pitch. The lack of precision becomes more significant when the pitch frequency is comparable to or greater than the resonance frequency of interest. Consequently, it is more challenging to determine the formants of high-pitched voices (such as those of children and some women).

Techniques for measuring vocal tract resonances can be classified into four groups: i) estimation from formants of normal speech (e.g.
linear prediction [1]), which has the shortcomings mentioned above; ii) estimation from whispered speech [2], which is intrinsically noise excited, so the resulting formants are noisy; iii) estimation using an external source at the glottis [3], which needs three to four times the acoustic power of the speech signal and makes the subject uncomfortable; and iv) estimation using an external source at the lips, which is more accessible and provides better precision [4].

Referring to the theorem by Epps et al. [5], the resonances of the vocal tract can be measured as the peaks of the vocal tract impedance $Z_{VT}$, measured at the lips. The vocal tract impedance is in parallel with the external radiation impedance $Z_\epsilon$, which is defined as:

$$Z_\epsilon = a z \, \frac{jkr}{1 + jkr} \qquad (1)$$

with $k$ being the wavenumber, $k = 2\pi f / c$, $c$ the speed of sound, $f$ the frequency, $r$ the radial distance, $z$ the specific acoustic impedance and $a$ a geometrical constant [6]. In the current work $f \le 3.5$ kHz and $r$ is several millimetres (specifically, it is the radial distance between the lips and the microphone). This means that $kr \ll 1$ in eqn. 1 and consequently $Z_\epsilon \approx jkraz$. The sound source in this research drives the external radiation impedance $Z_\epsilon$ and $Z_{VT}$ in parallel:

$$Z_\parallel = \frac{1}{1/Z_{VT} + 1/Z_\epsilon} \qquad (2)$$

Fig. 1. A 22-tube Solidworks model and cutaway representation for vowel /u/, constructed to match the area functions of [7].

In eqn. 1, $Z_\epsilon$ increases only linearly with frequency (as indicated by $Z_\epsilon \approx jkraz$), but there are relatively strong resonances in $Z_{VT}$ over the frequency range of application. Con-