Generalization of Hidden Markov Models Using Fuzzy Integrals

Magdi Mohamed and Paul Gader
Department of Electrical and Computer Engineering
University of Missouri - Columbia

I. INTRODUCTION

The statistical methods of hidden Markov modeling were initially introduced and studied in the late 1960s and early 1970s. They have been found to be extremely useful for a wide spectrum of applications in ecology, cryptanalysis, speech recognition, and handwriting recognition [1-5]. In this paper, we generalize hidden Markov models using fuzzy integrals.

II. CLASSICAL HIDDEN MARKOV MODELS

A hidden Markov model is a doubly embedded stochastic process with an underlying process that is not observable (it is hidden) but can only be observed through another set of stochastic processes that produce the sequence of observations [1]. This means that a probabilistic function of a hidden Markov chain is a stochastic process generated by two interrelated mechanisms: an underlying Markov chain having a finite number of states, and a set of random functions, one of which is associated with each state. At discrete instants of time, the process is assumed to be in some state, and an observation is generated by the random function corresponding to the current state. The underlying Markov chain then changes states according to its transition probability matrix. There are a finite number, say N, of states in the model. At each clock time a new state is entered based upon a transition probability distribution which depends on the previous state (the Markovian property). After each transition is made, an observation output symbol is produced according to a probability distribution which depends on the current state. This probability distribution is held fixed for the state regardless of when and how the state is entered. There are thus N such observation probability distributions which, of course, represent random variables or stochastic processes.
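The two-mechanism generative process described above (a hidden Markov chain selecting states, and a state-dependent random function emitting symbols) can be sketched as follows. This is an illustrative NumPy sketch, not code from the paper; the function name `generate_sequence` and the small 2-state, 3-symbol model are assumptions chosen for the example.

```python
import numpy as np

def generate_sequence(A, B, pi, T, seed=None):
    """Sample a hidden state sequence and an observation sequence from an HMM.

    A  : (N, N) transition matrix, A[i, j] = P(q_{t+1} = S_j | q_t = S_i)
    B  : (N, M) emission matrix,   B[j, k] = P(v_k emitted | q_t = S_j)
    pi : (N,)   initial state distribution, pi[i] = P(q_1 = S_i)
    T  : number of clock times (length of the observation sequence)
    """
    rng = np.random.default_rng(seed)
    N, M = B.shape
    states, observations = [], []
    q = rng.choice(N, p=pi)                 # draw the initial state from pi
    for _ in range(T):
        states.append(q)
        observations.append(rng.choice(M, p=B[q]))  # emit a symbol from the current state
        q = rng.choice(N, p=A[q])           # transition according to row q of A
    return states, observations

# Hypothetical 2-state, 3-symbol model for illustration
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
states, obs = generate_sequence(A, B, pi, T=5, seed=0)
```

Note that only `obs` would be visible to an observer; `states` is the hidden part of the doubly embedded process.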
We now formally define the following model notation for a first-order discrete observation HMM:

T = length of observation sequence (total number of clock times)
N = number of states in the model
M = number of observation symbols
S = {S1, S2, ..., SN}, the states
V = {v1, v2, v3, ..., vM}, the discrete set of possible observations
qt = state visited at time t
Q = q1 q2 ... qT, a state sequence
A = {aij}, aij = P(q_{t+1} = Sj | q_t = Si), the state transition probability distribution
B = {bj(k)}, bj(k) = P(vk at t | qt = Sj), the observation probability distribution in state j
π = {πi}, πi = P(q1 = Si), the initial state distribution.

We use λ = (A, B, π) to represent the model. There are three key problems that must be solved for the model to be useful in real-world applications [1]. Two of these problems are the following:

A. The Classification Problem

The probability of an observation sequence O = O1 O2 ... OT given a model λ, P(O|λ), can be used to perform classification. The straightforward way of computing P(O|λ) is by enumerating every possible state sequence. Assuming statistical independence of observations, it follows that:

P(O|λ) = Σ_{all Q} P(O, Q|λ)

This method of computing P(O|λ) requires O(T N^T) computations. A method called the Forward-Backward procedure takes O(N^2 T) computations. Consider the forward variable α_t(i) defined as

α_t(i) = P(O1 O2 ... Ot, qt = Si | λ)    (2)

We can solve for α_t(i) inductively as follows:

Initialization: For all 1 ≤ i ≤ N,

α_1(i) = πi bi(O1)    (3)

0-7803-2125-1/94 $4.00 © 1994 IEEE
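The complexity gap between the two approaches can be made concrete. Below is an illustrative NumPy sketch, not code from the paper: `forward` implements the O(N^2 T) forward recursion, initialized with α_1(i) = πi bi(O1) as in equation (3) (the induction and termination steps shown are the standard completion of the procedure), and `brute_force` is the O(T N^T) enumeration over all state sequences. The model values are assumptions for the example.

```python
import numpy as np
from itertools import product

def forward(A, B, pi, O):
    """Forward procedure: compute P(O | lambda) in O(N^2 T) time.

    alpha[t, i] = P(O_1 ... O_{t+1}, q_{t+1} = S_i | lambda), 0-indexed in t.
    """
    N, T = len(pi), len(O)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, O[0]]                      # initialization: alpha_1(i) = pi_i b_i(O_1)
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]  # induction over previous states
    return alpha[-1].sum()                          # termination: sum_i alpha_T(i)

def brute_force(A, B, pi, O):
    """Enumerate every one of the N^T state sequences Q and sum P(O, Q | lambda)."""
    N, T = len(pi), len(O)
    total = 0.0
    for Q in product(range(N), repeat=T):
        p = pi[Q[0]] * B[Q[0], O[0]]
        for t in range(1, T):
            p *= A[Q[t - 1], Q[t]] * B[Q[t], O[t]]
        total += p
    return total

# Hypothetical 2-state, 3-symbol model and a short observation sequence
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
O = [0, 2, 1, 0]
p_fwd = forward(A, B, pi, O)
p_bf = brute_force(A, B, pi, O)
```

The two routines return the same probability; for T of practical size only the forward procedure remains tractable, since N^T grows exponentially in T.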