ABSTRACT

The behaviour of the voice source characteristics in connected speech was studied. Voice source parameters were obtained by automatic inverse filtering, followed by automatic fitting of the LF-model to the data. Consistent relations between voice source parameters and prosody were observed.

Keywords: inverse filtering; LF-model; voice source

1. INTRODUCTION

Many present-day text-to-speech systems produce speech that is intelligible, but doesn't sound natural. This lack of naturalness is at least in part due to the absence of voice source control rules. Numerous different voice source models have been proposed, some of which could be very useful for speech synthesis. But if a sophisticated voice source model is to improve speech quality, one has to be able to control its parameters. Therefore, there is a need for data on the behaviour of the voice source, or more specifically on the behaviour of those characteristics of the source that can be mapped onto the model parameters. These data can be used to extract rules for the model parameters, which can then be used to improve synthesis.

To extract rules a large amount of data is required. Both inverse filtering of the speech and fitting a model to the inverse filter results could be done by hand. However, this is time consuming, subjective and thus probably not reproducible. Therefore a procedure is developed to derive the voice source parameters automatically.

Most research on voice source characteristics has dealt with sustained vowels, produced in different ways. For sustained vowels, recorded with a high SNR, automatic extraction of the voice parameters is fairly easy. But from these data, obtained from isolated speech segments, it is difficult to formulate rules for whole utterances. Therefore, our aim is to study the behaviour of the voice source in connected, preferably spontaneous speech.
Apart from the vowels, we also want to extract source parameters for voiced consonants and for V/UV and UV/V transitions. Research on these topics is now in progress. In this article some results are presented. Special attention is given to the relation between voice source dynamics and prosody.

2. METHOD AND MATERIAL

2.1. SPEECH MATERIAL

To study voice source characteristics, data were obtained for four male subjects. For all subjects recordings were made of the speech signal, electroglottogram (EGG), subglottal (P_sub) and oral (P_or) pressure, lung volume, and electromyographic activity of some laryngeal muscles (mostly cricothyroid, vocalis, and sternohyoid). For the current article only data of one male subject were used. Near the end of a recording session he was asked to produce an utterance spontaneously. He then repeated this utterance 29 times. The experiment is described in more detail in Strik and Boves (in press). For this paper inverse filter results were obtained for two of the 30 utterances.

2.2. INVERSE FILTERING

The speech signals were transduced by a condenser microphone (B&K 4134) placed about 10 cm in front of the mouth, and amplified by a measuring amplifier (B&K 2607), using the built-in 22.5 Hz high-pass filter to suppress low frequency noise. The digitized speech signal was processed with a phase correction filter in order to undo the low frequency phase distortion. Closed glottis interval covariance LPC was used to estimate the parameters of the inverse filter. In Veth, Cranen, Strik & Boves (1990) it was shown that this technique is as powerful as more sophisticated techniques, like Robust ARMA analysis. The moment of glottal closure was determined from the EGG. Inverse filtering yields an estimate of the differentiated glottal volume flow (dU_g/dt); integration gives the flow signal (U_g).
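
The inverse filtering step can be illustrated with a minimal sketch. It is not the procedure used for the recordings, only a small instance of covariance-method LPC restricted to a closed glottis interval, followed by inverse filtering and integration. The function names, the model order, the sampling rate, the length of the analysis window and the synthetic two-pole test signal are all assumptions made for the illustration; in the study the closure instant comes from the EGG, whereas here it is simply known.

```python
import numpy as np

def covariance_lpc(x, start, stop, order):
    """Covariance-method LPC on samples x[start:stop] (requires start >= order).

    Solves x[n] ~ sum_k a[k] * x[n-k] in the least-squares sense, using only
    rows n inside the (assumed) closed glottis interval.
    """
    n = np.arange(start, stop)
    past = np.stack([x[n - k] for k in range(1, order + 1)], axis=1)
    coeffs, *_ = np.linalg.lstsq(past, x[n], rcond=None)
    return coeffs

def inverse_filter(x, coeffs):
    """Apply A(z) = 1 - sum_k a[k] z^-k; the output estimates dU_g/dt."""
    fir = np.concatenate(([1.0], -coeffs))
    return np.convolve(x, fir)[: len(x)]

# Synthetic test signal (assumption): impulses every 80 samples drive a
# stable two-pole "vocal tract" with known coefficients.
fs = 10_000                       # hypothetical sampling rate in Hz
true_a = np.array([1.3, -0.8])
excitation = np.zeros(240)
excitation[::80] = 1.0
speech = np.zeros_like(excitation)
for i in range(len(speech)):
    speech[i] = excitation[i]
    for k, ak in enumerate(true_a, start=1):
        if i - k >= 0:
            speech[i] += ak * speech[i - k]

closure = 80                      # closure instant (from the EGG in the study)
a_hat = covariance_lpc(speech, closure + 1, closure + 40, order=2)
residual = inverse_filter(speech, a_hat)   # estimate of dU_g/dt
flow = np.cumsum(residual) / fs            # integration gives U_g
```

Because the analysis window contains no excitation, the least-squares fit recovers the filter coefficients of this toy signal and the residual reproduces the impulse train. Real speech additionally needs a higher model order and the phase correction described above.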
ON THE RELATION BETWEEN VOICE SOURCE CHARACTERISTICS AND PROSODY
Helmer Strik & Louis Boves
Department of Language and Speech, Nijmegen University, P.O. Box 9103, 6500 HD Nijmegen, the Netherlands