ISSN (Print): 2320-3765
ISSN (Online): 2278-8875
International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering
(An ISO 3297: 2007 Certified Organization)
Vol. 4, Issue 12, December 2015
Copyright to IJAREEIE DOI: 10.15662/IJAREEIE.2015.0412046 9697

HMM-Based Analysis and Synthesis of Emotional Assamese Speech With Reference to its Prosody Features
Purnendu Acharjee 1, Jyotismita Talukdar 2
Assistant professor, Dept. of CSc, AIMT, Guwahati, Assam, India 1
Assistant professor, Center of IT, UTM, Shillong, Meghalaya, India 2

ABSTRACT: The present investigation focuses on how expressive content is apparent in the acoustic signal a speaker produces, and on listener reaction to that signal. This paper presents the emotional features of the Assamese language. Considering the monophthongs, diphthongs and triphthongs with respect to different prosody, we have seen that vowel length is not recognized as a distinguishing feature in Assamese. In restrained, deliberate speech, however, meaning-differentiating vowel length is an important issue in the pronunciation of Assamese diphthongs and triphthongs. It is found that there is no change of meaning for vowel length written with symbols as short ('hrashya') or long ('dirgha'). Assamese word pronunciation does not differentiate between long and short vowels, whereas its orthography has kept the requirement of short and long symbols. In the present study we consider three basic emotional features of the Assamese language: normal (neutral), angry and surprise. It has been observed that the Assamese vowels /a/ (আ) and /u/ (উ) show a distinction up to the 12th frame for the surprise and angry emotions, and from the 17th frame onwards they appear similar in all three emotions (prosody).
The Assamese vowels /i/ (ই) and /o/ (ও), by contrast, show similarity up to the 9th frame in the surprise and angry emotions, then dissimilarity up to the 15th frame, and then become flat at the end in all the emotions (prosody). For monophthongs (for example অ /a:/, আ /a/, ই /i/, এ /e/, ও /o/), it is observed that /a/ (আ) and /u/ (উ) show a clear-cut distinction in frames 9 to 12 for the neutral, surprise and angry emotions, and similarity from the 15th frame onwards. On the other hand, /i/ (ই) and /o/ (ও) show similarity up to the 11th frame in the neutral, surprise and angry emotions, and after that show distinctions up to the 17th frame. Similarly, diphthongs such as /a/ (আ) with /i/ (ই) and /u/ (উ) show a distinction up to the 13th frame for the three emotions, and in the remaining frames they show similar spectral behavior. It is found that in emotional speech there are three voice onset time (VOT) patterns in the Assamese language: (i) negative VOT (pre-voicing), where the vocal cords start vibrating before the release of the stop consonant, within an interval from -125 ms to -75 ms; (ii) zero VOT (simultaneous), where the vocal cords start vibrating more or less simultaneously with the release of the plosive, within an interval from 0 ms to +35 ms; and (iii) positive VOT (aspiration), where a delay follows the plosive release and the vocal cords start vibrating after an interval of 35 ms to 100 ms. When a native Assamese speaker utters sentences with emotion, several prosodic factors are seen to affect this phonetic-phonological characteristic, such as place of articulation, syllable stress, rhythm, speech rate, number of syllables and vowel quality. Since VOT values vary widely depending on the emotional state of the speaker, both VOT length and sentence length need to be computed in order to establish what proportion of the sentence is occupied by the VOT, and so to obtain the relative VOT (for each emotional state) in the target sentence.
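The three VOT intervals and the relative VOT described above can be illustrated with a minimal sketch. It is not the authors' procedure: the function names `classify_vot` and `relative_vot` are hypothetical, the shared boundary at exactly 35 ms is assigned to the zero-VOT class as an assumption, values outside the stated intervals are left unclassified, and relative VOT is assumed to be the absolute VOT duration divided by the sentence duration.

```python
from typing import Optional

# Interval bounds in milliseconds, as stated in the abstract.
NEGATIVE_VOT_MS = (-125.0, -75.0)   # pre-voicing
ZERO_VOT_MS = (0.0, 35.0)           # near-simultaneous voicing
POSITIVE_VOT_MS = (35.0, 100.0)     # aspiration


def classify_vot(vot_ms: float) -> Optional[str]:
    """Map a VOT value (ms) to 'negative', 'zero' or 'positive'.

    Assumption: a value of exactly 35 ms is treated as zero VOT.
    Values outside the three stated intervals return None.
    """
    if NEGATIVE_VOT_MS[0] <= vot_ms <= NEGATIVE_VOT_MS[1]:
        return "negative"
    if ZERO_VOT_MS[0] <= vot_ms <= ZERO_VOT_MS[1]:
        return "zero"
    if ZERO_VOT_MS[1] < vot_ms <= POSITIVE_VOT_MS[1]:
        return "positive"
    return None


def relative_vot(vot_ms: float, sentence_ms: float) -> float:
    """Proportion of the sentence duration occupied by the VOT.

    Assumption: the magnitude of the VOT is used, so that pre-voicing
    (negative VOT) also yields a positive proportion.
    """
    if sentence_ms <= 0:
        raise ValueError("sentence duration must be positive")
    return abs(vot_ms) / sentence_ms
```

For instance, under these assumptions a VOT of 50 ms in a 2000 ms sentence is classified as positive and occupies a proportion of 50 / 2000 of the sentence.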
KEYWORDS: emotion, VOT, HMM, ASR, GMM, synthesis, HPTS, cepstrum.

I. INTRODUCTION

The analysis of speech emotion is closely associated with the speech production mechanism. The speech acoustics as a whole play an important role in interpreting the meaning of particular acoustic parameters. Air flow through the vocal tract, powered by respiration, is the basis of all sound production with the human vocal apparatus. Interestingly,