www.tjprc.org editor@tjprc.org

International Journal of Computer Science Engineering and Information Technology Research (IJCSEITR)
ISSN(P): 2249-6831; ISSN(E): 2249-7943
Vol. 4, Issue 2, Apr 2014, 285-290
© TJPRC Pvt. Ltd.

SPEECH RECOGNITION OF HINDI PHENOMES USING MFCC AND BHATTACHARRYA HISTOGRAM DISTANCE

SANDEEP KAUR 1, MEENAKSHI SHARMA 2 & SUKHBEER SINGH 3
1 M.Tech Student, Department of CSE, Sri Sai College of Engineering, Pathankot, Punjab, India
2 Head of Department, Sri Sai College of Engineering, Pathankot, Punjab, India
3 Assistant Professor, Sri Sai College of Engineering, Pathankot, Punjab, India

ABSTRACT

This paper describes an algorithm that uses distance measures to find the similarity between the histogram profiles of the feature matrix built from audio signals (Hindi phonemes). The results were obtained with Swaranjali in tests conducted on a vocabulary of Hindi digits spoken by different speakers. Many researchers have used the root mean square (rms), log spectral distance, cepstral distance, likelihood ratio (minimum residual principle or delta coding (DELCO) algorithm), and a cosh measure (based upon two nonsymmetrical likelihood ratios). However, a measure based on the feature matrix profile has not been used, although it has a distinct advantage when it comes to finding similar features for voice profile recognition. The Bhattacharyya histogram distance is used to measure the distance between the histogram profiles of the feature matrix built from audio signals (Hindi phonemes).

KEYWORDS: MFCC, K-Means Algorithm, Framing, Windowing, Hamming Window, Fast Fourier Transform, Mel-Scaled Filter Bank, Bhattacharyya Coefficient, ROC Curve

INTRODUCTION

Speech recognition is the process by which an algorithm identifies spoken words. In simple terms, it means talking to an algorithm and having it correctly recognize what is being said. The basic terms needed to understand speech recognition are utterance, speaker dependence, and vocabularies.
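The histogram comparison named in the abstract can be illustrated with a minimal sketch. It assumes that each feature matrix has been flattened into a list of values, that both histograms share the same bin edges, and that the distance is taken as the negative logarithm of the Bhattacharyya coefficient. The function names, the bin count, and the value range are hypothetical choices made for illustration; the paper's own implementation may differ (for example, some libraries use the variant sqrt(1 - BC) instead).

```python
import math


def histogram(values, bins, lo, hi):
    """Return a normalized histogram (entries sum to 1) of values over [lo, hi]."""
    if bins < 1 or hi <= lo:
        raise ValueError("need bins >= 1 and hi > lo")
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        if lo <= v <= hi:
            # Clamp the index so that v == hi falls into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside [lo, hi]")
    return [c / total for c in counts]


def bhattacharyya_coefficient(p, q):
    """Overlap of two normalized histograms: 1 for identical, 0 for disjoint."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def bhattacharyya_distance(p, q):
    """Distance -ln(BC); 0 for identical histograms, infinity for disjoint ones."""
    bc = bhattacharyya_coefficient(p, q)
    if bc <= 0.0:
        return math.inf
    # Guard against rounding that pushes BC slightly above 1.
    return max(0.0, -math.log(min(bc, 1.0)))


if __name__ == "__main__":
    # Hypothetical flattened feature values for a reference and a test utterance.
    reference = histogram([0.1, 0.2, 0.3, 0.4], bins=4, lo=0.0, hi=1.0)
    candidate = histogram([0.1, 0.2, 0.6, 0.7], bins=4, lo=0.0, hi=1.0)
    print(bhattacharyya_distance(reference, candidate))
```

A smaller distance indicates more similar histogram profiles, so a recognizer built on this idea would pick the stored profile with the smallest distance to the test utterance.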
The term phoneme was reportedly first used by A. Dufriche-Desgenettes in 1873, but it referred only to a speech sound. The term phoneme as an abstraction was developed by the Polish linguist Jan Niecisław Baudouin de Courtenay and his student Mikołaj Kruszewski during 1875–1895 [1]. The term these two used was fonema, the basic unit of what they called psychophonetics. The concept of the phoneme was then elaborated in the works of Nikolai Trubetzkoi and others of the Prague School (during the years 1926–1935), and in those of structuralists like Ferdinand de Saussure, Edward Sapir, and Leonard Bloomfield. Some structuralists (though not Sapir) rejected the idea of a cognitive or psycholinguistic function for the phoneme [2][3].

Units of Speech

A phoneme is a basic unit of a language's phonology, which is combined with other phonemes to form meaningful units such as words or morphemes. The phoneme can be described as "the smallest contrastive linguistic unit which may bring about a change of meaning" [6]. For example, the difference in meaning between the English words kill and kiss results from exchanging the phoneme /l/ for the phoneme /s/. Two words that differ in meaning through a contrast of a single phoneme are called minimal pairs. Some linguists (such as Roman Jakobson and Morris Halle) proposed that phonemes may be further decomposable into features, such features being the true minimal constituents of language [7].