International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN: 2278-3075, Volume-8 Issue-10, August 2019
Published By: Blue Eyes Intelligence Engineering & Sciences Publication. Retrieval Number: J87650881019/2019©BEIESP. DOI: 10.35940/ijitee.J8765.0881019

A Robust Isolated Automatic Speech Recognition System using Machine Learning Techniques

Sunanda Mendiratta, Neelam Turk, Dipali Bansal

Abstract: Speech recognition systems are used to make communication between human and machine fast. A number of speech-based systems have been developed by various researchers, for example for speech recognition, speaker verification and speaker recognition. The basic stages of a speech recognition system are pre-processing, feature extraction, feature selection and classification. Numerous works have aimed at improving each of these stages to obtain better and more accurate results. The main focus of this paper is the addition of machine learning to the speech recognition system. The paper first covers the architecture of automatic speech recognition (ASR), which gives an idea of the basic stages of a speech recognition system. The focus then moves to the use of machine learning in ASR. The work done by various researchers using the support vector machine (SVM) and the artificial neural network (ANN) is covered in one section of the paper. In addition, a review is presented of work done using SVM, ELM, ANN, Naive Bayes and kNN classifiers. The simulation results show that the best accuracy is achieved using the ELM classifier. The last section of the paper covers the results obtained with the proposed approaches, in which SVM, ANN with the Cuckoo search algorithm and ANN with back propagation are used as classifiers. The focus is also on improving the pre-processing and feature extraction processes.

Keywords: Speech recognition system, SVM, kNN, ANN, Cuckoo search optimization, ELM

I. INTRODUCTION

The ability to communicate is one of the most fundamental aspects of human behaviour.
Humans communicate with each other through natural languages, both verbally and in writing. Speech is the vocalized form of human communication, complementing its written form [1]. Advances in language and speech technologies have made high-quality human-computer interaction systems possible. Such systems have broad applications in education, entertainment and business. To make man-machine communication more user friendly, human-computer interfaces are designed so that users and machines interact in natural language [2]. As in human-human communication, the flow of information between human and computer defines a loop of interaction. Natural language, as speech or text, makes this communication possible, and speech is the most convenient form for humans. This motivates the development of speech recognition systems, in which the machine understands the meaning of human speech. This is a difficult problem and an active area of research.

Revised Manuscript Received on August 09, 2019.
* Correspondence Author
Sunanda Mendiratta*, Department of Electronics Engineering, J. C. Bose UST, Faridabad, India. E-mail: sunanda.mendiratta@gmail.com
Neelam Turk, Department of Electronics Engineering, J. C. Bose UST, Faridabad, India.
Dipali Bansal, ECE Department, FET, Manav Rachna International Institute of Research and Studies, Faridabad, India.

Speech recognition translates spoken words into the corresponding written script. An automatic speech recognition (ASR) system identifies the language of the speech and then converts the segments of input speech into the corresponding units of text in that language. This has made interaction between humans and computers easier and systems more user friendly [3]. The long-term goal of human-computer interaction (HCI) is to minimize the barrier between the human's mental model of what they want to accomplish and the computer's support of the user's task. Possible applications of speech recognition in HCI include voice user interfaces such as voice dialling, data entry, preparation of structured documents, speech-to-text processing and use in aircraft. ASR technology is also used in language learning, where it supports proper pronunciation and helps learners develop fluency in their speaking skills [4]. With speech-to-text programs, physically disabled students who suffer from strain injuries to the upper extremities are relieved of worrying about handwriting. Speech recognition technology also makes it possible to use a computer at home to search the internet without physically operating a keyboard or mouse. Students with learning disabilities can write better with speech recognition because they need not be concerned with spelling and other writing mechanics [2].

ASR can be used to facilitate communication between machines and humans, and man-machine interaction has been demonstrated in various speech-based applications. Its applications include communication interfaces for people with special abilities, translation devices, hands-free machine operation, dictation systems and voice-mail systems in telephony. On the other hand, its limitations include dependence on a noise-free environment, on vocabulary and language, on low talking rates and on the speaker. Various researchers have therefore worked in this field to improve the results [5]. The basic idea behind ASR can be explored in the context of isolated word recognition (IWR). The goal of ASR is to convert a speech signal into its equivalent text message, independent of environment, speaker and device [6]. It is a pattern recognition problem in which features are extracted from the signal and a model is trained and tested on them.

This paper is divided into several sections. The second section gives a brief introduction to the ASR architecture.
The third section gives brief details of machine learning and its use in ASR, and also reviews the use of SVM and ANN in speech recognition systems.
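
The pattern-recognition view of isolated word recognition described above (pre-process the signal, extract features, then train and test a model) can be illustrated with a minimal sketch. It is not the system proposed in this paper. Noisy sine tones stand in for recorded words, the features (mean zero-crossing rate and mean log energy) and the 1-nearest-neighbour rule are illustrative choices, and every function name and parameter value is hypothetical.

```python
import math
import random


def synth_word(freq_hz, n=800, rate=8000, noise=0.05, rng=None):
    """Hypothetical stand-in for a recorded word: a noisy sine tone."""
    rng = rng or random.Random(0)
    return [math.sin(2 * math.pi * freq_hz * t / rate) + rng.gauss(0, noise)
            for t in range(n)]


def pre_emphasis(signal, alpha=0.97):
    """Pre-processing: first-order high-pass filter y[t] = x[t] - alpha * x[t-1]."""
    return [signal[0]] + [signal[i] - alpha * signal[i - 1]
                          for i in range(1, len(signal))]


def split_frames(signal, size=200, step=100):
    """Cut the signal into overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


def extract_features(signal):
    """Feature extraction: (mean zero-crossing rate, mean log energy) over frames."""
    zcrs, energies = [], []
    for frame in split_frames(pre_emphasis(signal)):
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        zcrs.append(crossings / (len(frame) - 1))
        energies.append(math.log(sum(x * x for x in frame) / len(frame) + 1e-12))
    return (sum(zcrs) / len(zcrs), sum(energies) / len(energies))


def classify_1nn(features, training_set):
    """Classification: return the label of the closest training example."""
    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(features, item[0]))
    return min(training_set, key=squared_distance)[1]


if __name__ == "__main__":
    rng = random.Random(42)
    # Assumed toy vocabulary: each "word" is a tone at a different frequency.
    vocabulary = {"low": 300, "high": 1200}
    # Training: three noisy examples per word.
    training_set = [(extract_features(synth_word(freq, rng=rng)), label)
                    for label, freq in vocabulary.items() for _ in range(3)]
    # Testing: one unseen noisy example per word.
    for label, freq in vocabulary.items():
        predicted = classify_1nn(extract_features(synth_word(freq, rng=rng)),
                                 training_set)
        print(label, "->", predicted)
```

A real recognizer would replace the synthetic tones with recorded utterances and the two toy features with richer ones. The nearest-neighbour rule could likewise be swapped for one of the classifiers discussed in this paper, such as SVM, ANN or ELM.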