ACCURATE PARAMETER ESTIMATION OF NOISY SPEECH-LIKE SIGNALS
RAMDAS KUMARESAN and DONALD W. TUFTS
Department of Electrical Engineering
University of Rhode Island
Kingston, RI 02881
ABSTRACT
Assuming that a short segment of an observed speech signal can be modeled as the impulse response of a linear system, we obtain the pole locations of the system from the zeros of a polynomial. The coefficients of this polynomial are determined by two methods using an eigenvalue/eigenvector decomposition of an estimated correlation matrix.
The speech-like signal is used in the reversed time direction to identify the signal related zeros of the polynomial from the rest. Experimental results comparing the accuracy of the pole parameters obtained by different methods are given.
INTRODUCTION:
Speech signals are modeled quite successfully as the output of a linear system (vocal tract) excited by wide band signals (1). Many linear prediction (LP) based algorithms (1) do not give accurate estimates of the pole or pole/zero parameters representing the vocal tract, especially when the data record is short and noisy. If the order of the speech model is overestimated, some of these algorithms will give false formant locations (see section III). Also, for nasalized speech signals the parameters given by the LP based techniques seem grossly inaccurate (2). There have been numerous attempts to solve this problem. Recent reports of Henderson (3) and Van Blaricum (4) are closely connected to our work. They have attempted to obtain better estimates of pole locations from noisy impulse response data using an eigenvalue/eigenvector decomposition of an estimated correlation matrix. The methods we present here (5,6) are improvements over their techniques.
This research was supported by the Office of Naval Research under grant No. N0014-81-K-0144.

The salient features of our methods are as follows. We have previously used similar ideas for resolving closely spaced spectral peaks (7,8). 1) We model the speech data as an impulse response sequence or, equivalently, a sum of M exponentially damped signals with arbitrary phases and amplitudes. 2) As in other linear prediction based methods, we obtain the pole parameters of the impulse response from the zeros of a polynomial, similar to the prediction error filter polynomial (1). The key point is that the accuracy of the coefficients of this polynomial is ensured by using the eigenvalue/eigenvector or, equivalently, singular value decomposition (SVD) of the correlation/data matrix. Once the pole parameters are determined accurately, the zeros, if any, in the case of nasalized sounds can be determined using other methods (see ref. (5)). 3) We use an Lth degree polynomial G(z), where L is larger than M, the number of exponentials in the data. Also, L is chosen as a sizable fraction of N, the number of data samples, which is assumed small. This turns out to be an important factor in improving the accuracy of the estimates. 4) Lastly, we use the data in the backward direction to isolate the M zeros of G(z) related to the signal parameters, called signal zeros, from the rest, the L-M extraneous zeros (9).
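The signal model in item 1) can be illustrated with a short Python/NumPy sketch. It synthesizes a sum of M exponentially damped complex signals with arbitrary amplitudes and phases. The function name `damped_exponentials`, the example amplitudes and poles, and the optional complex white noise term are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

def damped_exponentials(amps, poles, N, noise_std=0.0, seed=0):
    """Return y(n) = sum_k amps[k] * exp(poles[k] * n) for n = 1..N.

    amps  : complex amplitudes (magnitude and phase of each component)
    poles : complex exponents; a negative real part gives damping
    noise_std : standard deviation of optional complex white noise (assumed model)
    """
    n = np.arange(1, N + 1)
    amps = np.asarray(amps, dtype=complex)
    poles = np.asarray(poles, dtype=complex)
    y = (amps[:, None] * np.exp(poles[:, None] * n)).sum(axis=0)
    if noise_std > 0:
        rng = np.random.default_rng(seed)
        noise = rng.standard_normal(N) + 1j * rng.standard_normal(N)
        y = y + noise_std * noise / np.sqrt(2)
    return y

# Hypothetical example with M = 2 components and N = 25 samples.
amps = np.array([1.0, np.exp(1j * 0.3)])
poles = np.array([-0.1 + 2j * np.pi * 0.12, -0.2 + 2j * np.pi * 0.22])
y = damped_exponentials(amps, poles, N=25)
y_noisy = damped_exponentials(amps, poles, N=25, noise_std=0.05)
```

A short record such as N = 25 mirrors the paper's assumption that the number of data samples is small.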
II. ESTIMATING THE POLE LOCATIONS:
We shall present two closely related methods to determine the pole locations from noisy impulse response data. A short segment of a speech-like signal, y(n), n=1,2,...,N, assumed to be a sum of M exponentially damped sinusoidal signals, is observed. That is, $y(n) = \sum_{k=1}^{M} a_k e^{s_k n}$, where $s_k$ and $a_k$ are complex numbers in general. We construct a (N-L)x(L+1) data matrix A, below, using the N samples of y(n).
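$$
A = \begin{bmatrix}
y^*(1) & y^*(2) & \cdots & y^*(L+1) \\
y^*(2) & y^*(3) & \cdots & y^*(L+2) \\
\vdots & \vdots & & \vdots \\
y^*(N-L) & y^*(N-L+1) & \cdots & y^*(N)
\end{bmatrix}
$$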
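A minimal Python/NumPy sketch of how the data matrix A might be formed and used in the noiseless case follows. It is not the authors' exact algorithm. The assumptions are: each row of A holds L+1 consecutive conjugated samples, the coefficient vector is the minimum-norm solution obtained from a rank-M truncated SVD, the polynomial has the form G(z) = 1 + g_1 z^{-1} + ... + g_L z^{-L}, and the M zeros of largest magnitude are taken as the signal zeros. The function names are hypothetical.

```python
import numpy as np

def backward_data_matrix(y, L):
    """(N-L) x (L+1) matrix; row i holds conj(y(i)), ..., conj(y(i+L))."""
    yc = np.conj(np.asarray(y))
    return np.array([yc[i:i + L + 1] for i in range(len(yc) - L)])

def estimate_poles(y, L, M):
    """Return (pole estimates, all L zeros of G(z)) under the assumptions above."""
    A = backward_data_matrix(y, L)
    h, B = A[:, 0], A[:, 1:]                # A = [h | B]; solve B g_tail = -h
    U, sv, Vh = np.linalg.svd(B, full_matrices=False)
    # Rank-M truncated pseudo-inverse gives the minimum-norm solution.
    g_tail = -(Vh[:M].conj().T @ ((U[:, :M].conj().T @ h) / sv[:M]))
    g = np.concatenate(([1.0], g_tail))     # coefficient vector (1, g_1, ..., g_L)
    zeros = np.roots(g)                     # roots of z^L * G(z)
    # For damped signals the signal zeros exp(-conj(s_k)) lie outside the unit circle.
    signal = zeros[np.argsort(-np.abs(zeros))[:M]]
    return -np.conj(np.log(signal)), zeros

# Noiseless example: M = 2 damped exponentials, N = 25 samples, L = 12.
n = np.arange(1, 26)
s_true = np.array([-0.1 + 2j * np.pi * 0.12, -0.2 + 2j * np.pi * 0.22])
y = np.exp(s_true[0] * n) + np.exp(s_true[1] * n)
s_hat, zeros = estimate_poles(y, L=12, M=2)
```

Because the complex logarithm returns angles in (-pi, pi], the recovered imaginary parts match the true ones only when the frequencies lie in that range, as they do in this example.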
'*' denotes complex conjugate. In both methods we attempt to find a vector $g = (1, g_1, g_2, \ldots, g_L)^T$ with which we form the polynomial $G(z) = 1 + \sum_{k=1}^{L} g_k z^{-k}$. M out of the L zeros of G(z) are estimates of $e^{-s_k^*}$, k=1,2,...,M. If the data is indeed a sum of M exponentials and noiseless, then the polynomial G(z), with coefficient vector g which satisfies the homogeneous equation $A g = 0$, will have zeros at $e^{-s_k^*}$, k=1,2,...,M. But L has to satisfy the inequality $M \le L \le (N-L)$
CH1746-7/82/0000-1357 $00.75 © 1982 IEEE