ISSN: 2319-8753
International Journal of Innovative Research in Science, Engineering and Technology (An ISO 3297: 2007 Certified Organization)
Vol. 3, Issue 12, December 2014
DOI: 10.15680/IJIRSET.2014.0312034
Copyright to IJIRSET, www.ijirset.com

A Comparative Study of Feature Extraction Techniques for Speech Recognition System

Pratik K. Kurzekar 1, Ratnadeep R. Deshmukh 2, Vishal B. Waghmare 2, Pukhraj P. Shrishrimal 2

M.Tech Student, Department of Computer Science and Information Technology, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad (MS), India

ABSTRACT: Automatic speech recognition enables a natural and easy mode of communication between human and machine. Speech processing has wide applications in voice dialing, telephone communication, call routing, domestic appliance control, speech-to-text conversion, text-to-speech conversion, lip synchronization, automation systems, etc. This paper discusses some of the most widely used feature extraction techniques: Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding (LPC) analysis, Dynamic Time Warping (DTW), Relative Spectra Processing (RASTA) and Zero Crossings with Peak Amplitudes (ZCPA). Some techniques, such as RASTA and MFCC, take the nature of speech into account while extracting features, whereas LPC predicts future features based on previous features.

KEYWORDS: Speech Recognition, Feature Extraction, Linear Predictive Coding (LPC), Mel Frequency Cepstral Coefficients (MFCC), Zero Crossings with Peak Amplitudes (ZCPA), Dynamic Time Warping (DTW), Relative Spectra Processing (RASTA).

I. INTRODUCTION

Speech is the most common form of communication among human beings. Many languages are spoken around the world for communication [1]. Researchers are trying to develop systems that can analyze and classify the speech signal [2].
A computer system that can understand spoken language can be very useful in areas such as agriculture, health care and the government sector. Speech recognition refers to the ability to listen to spoken words, identify the various sounds present in them, and recognize them as words of some known language [3]. Speech signals are quasi-stationary. When a speech signal is examined over a short period of time (5-100 msec), its characteristics are stationary; over a longer period, however, the signal characteristics change, reflecting the different speech sounds being spoken. Features are extracted from the speech signal on the basis of its short-term amplitude spectrum (phonemes). Feature extraction is the most important phase in a speech recognition system. Some problems arise during the feature extraction process because of the variability among speakers [4]. This paper presents a comparative study of some of the most widely used feature extraction techniques for speech recognition systems. The rest of the paper is organized as follows: Section 2 describes what is meant by feature extraction, Section 3 describes commonly used feature extraction techniques, and Section 4 compares the feature extraction techniques. The conclusion is given in Section 5.

II. RELATED WORK

Over the years, a number of different methodologies have been proposed for isolated word and continuous speech recognition. These can usually be grouped into two classes: speaker-dependent and speaker-independent. Speaker-dependent methods usually involve training a system to recognize each of the vocabulary words uttered one or more times by a specific set of speakers [5, 6], while for speaker-independent systems such training methods are generally not applicable and words are recognized by analyzing their inherent acoustical properties [7, 8]. Various features have been used singly or in combination with others to model the speech signals, ranging from Linear