International Journal of Current Engineering and Technology, Vol.4, No.3 (June 2014)
E-ISSN 2277-4106, P-ISSN 2347-5161
©2014 INPRESSCO®, All Rights Reserved
Available at http://inpressco.com/category/ijcet

Research Article

A Systematic Analysis of Automatic Speech Recognition: An Overview

Taabish Gulzar Ȧ*, Anand Singh Ȧ, Dinesh Kumar Rajoriya and Najma Farooq Ȧ

Ȧ Department of Electronics and Communication, Dehradun Institute of Technology, Mussourie Diversion Road, Makkwala Dehradun, India
Department of Electronics and Communication, Sagar Institute of Science, Technology and Engineering, Bhopal, M.P, India

Accepted 18 May 2014, Available online 01 June 2014

Abstract

Speech is the most prominent and primary means of communication among humans. Despite research and development in the field of automatic speech recognition, its accuracy remains a research challenge. This paper reviews past work comparing modern speech recognition systems with humans to determine how far recent dramatic progress in technology has advanced towards the objective of human-like performance. An overview of sources of knowledge is introduced, and the use of knowledge to create and verify hypotheses is discussed.

Keywords: Automatic speech recognition, Feature Extraction, Utterance, Dynamic time warping, Matching.

1. Introduction

For several decades, human beings have tried to create technologies that can recognize speech correctly. While humans can differentiate speech very easily, they in fact make use of a great deal of acoustic, linguistic and contextual information. The relation between the physical speech signal and the corresponding words has been found to be complex and hard to understand.
Both research areas, automatic speech recognition (ASR) and human speech recognition (HSR), study the recognition process from the acoustic signal to a series of recognized units. For ASR, the objective is to automatically transcribe the speech signal as a sequence of items that is as close as possible to a reference transcription (L. Rabiner et al, 1993; F. Jelinek, 1997). In HSR, the attention is on understanding how human listeners recognise spoken utterances. Building on advances in the statistical modelling of speech, ASR systems find extensive application in tasks that rely on a human-machine interface, such as automatic call processing in telephone networks, query-based information systems that provide updated travel information, stock price quotations and weather reports, and embedded systems.

1.1 Definition and Basic Model of Speech Recognition

Speech recognition, also known as automatic speech recognition (ASR), is defined as the process of converting a speech signal into a set of words by means of an algorithm that can be implemented as a system program, or, equivalently, as the process of converting an acoustic signal, captured by a microphone or a telephone, into a set of words (V. Zue et al, 1996; Z. Mengjie, 2001). ASR is one of the fastest growing areas in the framework of speech science and engineering. Research in speech processing and communication was, for the most part, motivated by people’s desire to build mechanical models that emulate human verbal communication capabilities. The primary aim of ASR is to develop new techniques and systems for speech input to machines. A mathematical representation of a speech recognition system, expressed in straightforward equations and comprising a front-end unit, a model unit, a language model unit, and a search unit, is shown in Fig. 1.

Fig. 1 Basic model of speech recognition (input: speech)

*Corresponding author: Taabish Gulzar
One standard approach to large vocabulary continuous speech recognition is to assume a simple probabilistic model of speech production whereby a specified word set, W, generates an acoustic observation sequence, Y, with probability P(W,Y). The objective is then to decode the