International Journal of Computer Applications (0975 – 8887), Volume 175 – No. 1, October 2017

Survey paper on Different Speech Recognition Algorithm: Challenges and Techniques

Ayushi Y. Vadwala, B.Tech Student, Madhuben & Bhanubhai Patel Women’s Institute Of Engineering, New Vallabh Vidyanagar, Gujarat, India
Krina A. Suthar, B.Tech Student, Madhuben & Bhanubhai Patel Women’s Institute Of Engineering, New Vallabh Vidyanagar, Gujarat, India
Yesha A. Karmakar, B.Tech Student, Madhuben & Bhanubhai Patel Women’s Institute Of Engineering, New Vallabh Vidyanagar, Gujarat, India
Nirali Pandya, Assistant Professor, Madhuben & Bhanubhai Patel Women’s Institute Of Engineering, New Vallabh Vidyanagar, Gujarat, India

ABSTRACT
Speech is the primary mode of communication among human beings. Communication between humans and computers is referred to as the human-computer interface, and speech can be used to communicate with a computer. Speech recognition research is becoming increasingly active. Today, researchers are working to extend what computers can do with spoken words. This paper presents a classification of the algorithms through which an uttered word can be converted into a computer-intelligible form. The challenges in speech recognition are enumerated and analyzed for the most popular recognition techniques used today. The analysis ends with a brief description of some applications of speech recognition.

General Terms
Algorithms of Speech Recognition.

Keywords
Speech Recognition, Hidden Markov Model, Artificial Intelligence, Pattern Recognition, Neural Network.

1. INTRODUCTION
Speech is the most important, widespread, and efficient way for people to communicate with each other. Humans are comfortable with speech, so they would also like to interact with computers via speech rather than via primitive interfaces such as keyboards and pointing devices.
Speech recognition is an interdisciplinary subfield of computational linguistics that develops techniques and technologies enabling computers to recognize spoken words and translate them into text. It is also known as "speech to text" (STT). It draws on knowledge and research from linguistics, computer science, and electrical engineering. The objective of speech recognition is for a computer to be able to "perceive", "recognize", and "take action upon" spoken words [2][3][4][5]. These linguistic techniques and approaches can be used to develop different types of applications based on speech recognition. Applications with a speech recognition feature also make life easier for people with physical disabilities, as well as for any ordinary user who is drawn to voice recognition. A prime example of a speech-recognition-based application is the intelligent voice assistant, which takes speech as input and performs the corresponding actions on different platforms [1]. Google provides a speech recognition API for Android application developers to make their programming easier; it uses various techniques that are described in this paper [22].

2. CHALLENGES OF SPEECH RECOGNITION SYSTEM
An utterance is the vocalization of one or more words. It can be a single word, a few words, a sentence, or even multiple sentences. The types of speech utterance are:

2.1 Utterance approach
This refers to how the words are spoken: in an isolated or in a connected manner.

2.1.1 Isolated words
An isolated-word speech recognition system requires the speaker to pause briefly between words. This does not mean that it recognizes only single words, but it does require a single utterance at a time. This is fine for situations where the user is required to give only one-word responses or commands, but it is very unnatural for multiple-word inputs.
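
The reliance of isolated-word systems on pauses can be illustrated with a minimal sketch of energy-based pause detection. This is not a method described in this paper; the function names, the frame length, the energy threshold, and the minimum pause length are all hypothetical values chosen for illustration, and the input is a synthetic signal rather than recorded speech.

```python
import math

def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def split_on_pauses(samples, frame_len=80, threshold=0.01, min_pause_frames=3):
    """Return (start, end) sample indices of utterances separated by pauses.

    A frame counts as speech when its energy exceeds `threshold` (an assumed
    value); an utterance ends once `min_pause_frames` consecutive quiet
    frames follow it.
    """
    segments = []
    start = None  # frame index where the current utterance began
    quiet = 0     # consecutive quiet frames seen inside an utterance
    energies = frame_energies(samples, frame_len)
    for idx, energy in enumerate(energies):
        if energy > threshold:
            if start is None:
                start = idx
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_pause_frames:
                end = idx - quiet + 1  # index of the first quiet frame
                segments.append((start * frame_len, end * frame_len))
                start, quiet = None, 0
    if start is not None:
        # Signal ended while an utterance was still open.
        end = len(energies) - quiet
        segments.append((start * frame_len, end * frame_len))
    return segments

def tone(n, amplitude=0.5, period=16):
    """Synthetic stand-in for a spoken word: a plain sine wave."""
    return [amplitude * math.sin(2 * math.pi * i / period) for i in range(n)]

# Two "words" separated by a clear pause.
signal = [0.0] * 400 + tone(800) + [0.0] * 400 + tone(800) + [0.0] * 400
print(split_on_pauses(signal))  # [(400, 1200), (1600, 2400)]
```

With a pause shorter than `min_pause_frames`, the two tones would be returned as a single segment, which is why such a scheme only works when the speaker deliberately separates words.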
2.1.2 Connected words
Connected-word systems (or, more correctly, 'connected utterances') are similar to isolated-word systems, but they allow separate utterances to be 'run together' with a minimal pause between them.

2.2 Utterance style
All humans speak differently; it is a means of expressing their personality. Not only do they use personal vocabulary, they also have a unique way of articulating and emphasizing. Speaking style also varies across situations. Humans also communicate their emotions through speech: a person speaks differently when he or she is happy, sad, irritated, anxious, upset, defensive, etc. Utterance style is broadly divided into two categories, according to whether the speech is continuous or spontaneous.

2.2.1 Continuous Speech
Continuous speech recognizers allow users to speak almost naturally, while the computer determines the content. Continuous speech involves a great deal of "co-articulation", where adjacent words are spoken together without pauses or any other noticeable division between words. Continuous speech recognition systems are extremely difficult to create because they must use special methods to determine utterance boundaries. As the vocabulary grows, confusability between different word sequences increases.
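
One rough way to see why a larger vocabulary makes continuous recognition harder is to count the word sequences a recognizer would have to tell apart if nothing constrained word order. The sketch below is only an illustrative upper bound under that assumption, not a measure of acoustic confusability taken from this paper; the function name and the vocabulary sizes are hypothetical.

```python
def candidate_sequences(vocab_size, num_words):
    """Number of distinct word sequences of a given length,
    assuming every word may follow every other word."""
    return vocab_size ** num_words

# Candidate three-word sequences for a few assumed vocabulary sizes.
for vocab_size in (10, 1_000, 100_000):
    print(vocab_size, candidate_sequences(vocab_size, 3))
```

For a fixed utterance length, the count grows polynomially in the vocabulary size, so each added word multiplies the number of competing hypotheses.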