Use of a Novel Generalized Fuzzy Hidden Markov Model for Speech Recognition

Adrian David Cheok, Sylvain Chevalier*, Mustafa Kaynak, Kuntal Sengupta, Ko Chi Chung
National University of Singapore, Singapore and *Institut National Polytechnique de Grenoble, France

Abstract

In this paper, we discuss a novel type of hidden Markov model (HMM) based on fuzzy sets and fuzzy integral theory which generalizes the classical stochastic HMM. The Choquet integral is used as a fuzzy integral which relaxes one of the two independence assumptions of the classical HMM. We apply this new model to speech recognition and compare its performance with the classical HMM. The main innovation of this research is that this new generalized fuzzy HMM is applied for the first time to speech recognition. Due to the fuzziness of the model, an interesting gain can be observed in terms of a lower computation time.

Keywords: Speech Recognition, Generalized HMM, Fuzzy Integrals

1 Introduction

Speech recognition has become a powerful human-computer interface with many advantages. However, there are still many research problems to resolve in speech recognition, as it is still often not completely robust or efficient for certain applications [1]. Nevertheless, speech recognition systems today can obtain high accuracy with the utilization of neural networks, fuzzy logic and hidden Markov models (HMMs). Today, the HMM is the most widely used, and its strong mathematical base allows many new studies to improve its efficiency. Recently, a novel generalization of the HMM has been introduced [2] and successfully applied to handwriting recognition [3]. This new model uses a fuzzy approach instead of the stochastic one of the classical HMM. This requires a new computation of integrals and summations that uses fuzzy integrals and fuzzy measures.
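As a concrete illustration of the fuzzy integral used in this approach, the following minimal sketch (not from the paper; function names are hypothetical) computes the discrete Choquet integral of a function with respect to a fuzzy measure. When the measure is additive (a probability measure), the Choquet integral collapses to the ordinary expectation, which is the special case that recovers the classical HMM.

```python
import itertools

def additive_measure(probs):
    """Build a probability (additive) measure on all subsets:
    g(A) = sum of the element probabilities in A.
    An additive measure is only one choice; a possibility
    measure (max-based) is the non-additive case discussed here."""
    g = {frozenset(): 0.0}
    for r in range(1, len(probs) + 1):
        for subset in itertools.combinations(probs, r):
            g[frozenset(subset)] = sum(probs[e] for e in subset)
    return g

def choquet(values, measure):
    """Discrete Choquet integral of `values` (element -> value)
    with respect to the fuzzy measure `measure` (frozenset -> weight)."""
    elems = sorted(values, key=values.get)   # ascending by value
    total, prev = 0.0, 0.0
    for i, e in enumerate(elems):
        # A_i: the set of elements whose value is >= values[e]
        total += (values[e] - prev) * measure[frozenset(elems[i:])]
        prev = values[e]
    return total
```

With `additive_measure` the result equals the expectation sum_i v_i p_i; substituting a max-based possibility measure for `measure` yields the non-additive behaviour that gives the computational gain described below.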
The advantage of using these new fuzzy operators (which are less constrained than classical integrals and probabilities) is that they relax the independence assumptions that are necessary with probability functions. Interestingly, it should be noted that one particular choice of fuzzy integral (the Choquet integral), fuzzy measure (probability measure), and fuzzy intersection operator (multiplication) reduces the generalized fuzzy HMM to the classical HMM. The fuzzy measure that is usually used, the possibility measure, is based on the max operator. This gives a constant result for small variations of the non-maximum arguments, and provides a significant gain in terms of lower computation time.

In this work, the main contribution is that this new model is applied for the first time to speech recognition, in order to show the feasibility of the new Generalized Fuzzy HMM (GFHMM) for this application, and to compare its performance with the classical HMM. The results show that in the case of a small lexicon, the recognition results are similar for fuzzy and classical HMMs. A gain can be observed in terms of a lower computation time, as the training process is usually faster for the generalized model. This is a consequence of the fuzzy approach, where fuzzy models require less accurate parameters to perform as well.

In this paper, firstly a brief background to HMMs is given in section 2, then in section 3 the theory of fuzzy measures and fuzzy integrals is explained. In section 4, the generalized hidden Markov model theory is introduced, and results in speech recognition are given in section 5.

0-7803-7293-X/01/$17.00 © 2001 IEEE

2 Hidden Markov Models

The Generalized Fuzzy Hidden Markov Model is based on the classical HMM. The classical HMM is a stochastic signal model that is viewed as an extension of Markov chains [4]. A Markov chain is a set of N states S_1 ... S_N with transitions between them.
At each time t, the current state is q_t, and the transitions are defined with respect to the current state and the predecessor state in terms of probabilities:

a_ij = P(q_t = S_j | q_{t-1} = S_i),

where a_ij is the state transition probability. For discrete, first-order Markov chains, the process is defined by the coefficients a_ij, stored in a square N-by-N transition matrix A that is independent of the time t, together with the initial state probabilities π_i = P(q_1 = S_i), which are stored in an N-dimensional array π. We can then compute the probability of a state sequence Q = q_1 ... q_T as

P(Q) = π_{n_1} · a_{n_1 n_2} · a_{n_2 n_3} · ... · a_{n_{T-1} n_T},

where q_i = S_{n_i}.

The principle of a hidden Markov model is that the state sequence cannot be observed directly, but only through some observation sequence O = O_1 O_2 ... O_T that results from the actual state sequence Q = q_1 q_2 ... q_T according to the observation symbol probability distribution defined by the matrix B = [b_j(k)], with

b_j(k) = P(O_t = v_k | q_t = S_j),

where V = {v_1, v_2, ..., v_M} is the set of observation symbols. Hence, a hidden Markov model is fully defined by the matrices A and B and the array π, and the following notation is used for a model λ [4]: λ = (A, B, π).

For usual applications of HMMs, the utilization of the model can be divided into three problems [4]:

Problem 1: Computation of the probability of an observation sequence according to the model, P(O|λ). This is the problem of recognition, and it is usually solved using the forward-backward procedure.

Problem 2: Computation of the state sequence that best fits an observed sequence. The Viterbi algorithm can solve this problem.

Problem 3: Computation of the model parameters A, B and π that maximize the probability of an observation. This is the problem of training, and re-estimation formulas are the inductive equations of the training process.
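The forward procedure for Problem 1 can be sketched as follows. This is a minimal NumPy illustration of the standard recursion (the toy model parameters are invented for the example, not taken from the paper):

```python
import numpy as np

def forward(A, B, pi, obs):
    """Forward procedure for Problem 1: P(O | lambda) of a discrete HMM.
    A:   (N, N) state transition matrix a_ij
    B:   (N, M) observation symbol matrix b_j(k)
    pi:  (N,)   initial state distribution
    obs: list of observation symbol indices O_1 ... O_T"""
    alpha = pi * B[:, obs[0]]          # alpha_1(i) = pi_i * b_i(O_1)
    for o in obs[1:]:                  # induction over t
        # alpha_t(j) = [sum_i alpha_{t-1}(i) * a_ij] * b_j(O_t)
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()                 # P(O|lambda) = sum_i alpha_T(i)

# Toy two-state, two-symbol model (illustrative numbers only):
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.5, 0.5], [0.1, 0.9]])
pi = np.array([0.6, 0.4])
p = forward(A, B, pi, [0, 1])
```

The recursion costs O(N^2 T) operations, versus the O(N^T) of summing P(O, Q|λ) over all state sequences directly.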
In most applications of HMMs, such as in speech recognition, the observations do not take values in a finite set; instead, they are continuous signals. The most convenient way to model continuous probability densities is to use a set of Gaussian probability density functions (pdfs). The observation density b_j can then be written as a sum of Gaussian pdfs:

b_j(O) = sum_{m=1}^{M} c_{jm} G[O, μ_{jm}, U_{jm}],  for 1 ≤ j ≤ N    (1)

In this equation, M is the number of mixture components, c_{jm} is the mixture coefficient at state j for the m-th component, and G is the standard Gaussian pdf with mean μ_{jm} and covariance matrix U_{jm}.

2001 IEEE International Fuzzy Systems Conference
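Equation (1) can be sketched in code as follows. This is an illustrative implementation for a single state j, with hypothetical argument names; it is not the paper's code:

```python
import numpy as np

def gmm_density(o, c, means, covs):
    """Observation density of equation (1) for one state j:
    b_j(O) = sum_m c_jm * G[O, mu_jm, U_jm].
    o:     (d,)      observation vector
    c:     (M,)      mixture coefficients (should sum to 1)
    means: (M, d)    component means mu_jm
    covs:  (M, d, d) covariance matrices U_jm"""
    d = o.shape[0]
    total = 0.0
    for w, mu, U in zip(c, means, covs):
        diff = o - mu
        norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(U))
        expo = -0.5 * diff @ np.linalg.solve(U, diff)  # Mahalanobis term
        total += w * np.exp(expo) / norm               # weighted Gaussian pdf
    return total
```

Because each component density is non-negative and the weights sum to one, b_j integrates to 1 and is a valid continuous observation density.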