Advanced Research in Electrical and Electronic Engineering
p-ISSN: 2349-5804; e-ISSN: 2349-5812, Volume 2, Issue 14, October-December 2015, pp. 42-45
© Krishi Sanskriti Publications
http://www.krishisanskriti.org/Publication.html

Visual Speech Recognition through Zernike Moments

Promila Singh 1, A.N. Mishra 2 and Usha Sharma 3
1 M.Tech Student, Department of Electronics and Communication Engineering, Krishna Engineering College, Ghaziabad, U.P.
2 Department of Electronics and Communication Engineering, Krishna Engineering College, Ghaziabad, U.P.
3 Department of Electronics and Communication Engineering, JRE Group of Institutions, Greater Noida, U.P.
E-mail: 1 promisingh22@gmail.com, 2 an_mishra53@rediffmail.com, 3 ushasharma1529@gmail.com

Abstract—This paper presents a new learning-based representation referred to as Visual Speech Recognition through Zernike Moments. The automated recognition of human speech using only features from the visual domain has become a significant research topic that plays an essential role in the development of many multimedia systems, such as audio-visual speech recognition (AVSR), mobile phone applications, human-computer interaction (HCI) and sign language recognition. The inclusion of lip visual information is opportune because it can improve the overall accuracy of audio or hand recognition algorithms, especially when such systems operate in environments characterized by a high level of acoustic noise. The main components of the developed Visual Speech Recognition (VSR) system are applied to: (a) segment the mouth region of interest, (b) extract the visual features from the real-time input video and (c) compute the Zernike moments. The major difficulty associated with VSR systems resides in identifying the smallest elements contained in the image sequences that represent the lip movements in the visual domain. The objective of a visual speech recognition system is to improve recognition accuracy.
In this paper we computed visual features using Zernike moments on a visual vocabulary from a speaker-independent standard dataset of Hindi digits spoken by ten speakers. The visual features were normalized and the dimension of the feature set was reduced by principal component analysis (PCA) in order to recognize isolated word utterances in the PCA space.

1. INTRODUCTION

In recent years, many automatic speech-reading systems have been proposed that combine audio as well as visual speech features. In computer speech recognition, the visual component of speech is used to support acoustic speech recognition. The design of an audio-visual speech recognizer is based on the experience of human lip-reading experts. Hearing-impaired people achieve recognition rates of 60-80%, depending on the lip-reading conditions. The most important conditions for good lip-reading are the quality of a speaker's visual speech (proper articulation) and the angle of view. Sometimes people who are well understood from the acoustic component may not be easy to lip-read, but for hearing-impaired or even deaf people the visual speech component is an important source of information. Lip-reading (visual speech recognition) is used by people without disabilities too; it aids understanding when the acoustic speech is less intelligible. The task of automatic speech recognition by a computer using the visual component of speech has attracted many researchers to the automatic Audio-Visual Speech Recognition domain. The task is challenging because visual articulations vary from speaker to speaker and carry much less information than the acoustic signal; the identification of robust features therefore remains a center of attention for many researchers. In recent years there have been many advances in automatic speech-reading systems that include both audio and visual speech features to recognize words under noisy conditions. The objective of a visual speech recognition system is to improve recognition accuracy.
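The normalization and PCA reduction step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation; the feature dimension, number of retained components and sample counts are illustrative assumptions.

```python
# Sketch of the feature-reduction step: per-utterance visual feature
# vectors are normalized and projected onto a lower-dimensional PCA
# space before recognition. Dimensions below are illustrative.
import numpy as np

def pca_reduce(features, n_components):
    """Normalize feature vectors and project them onto the top
    n_components principal components.

    features: (n_samples, n_features) array of visual feature vectors.
    Returns the reduced (n_samples, n_components) representation.
    """
    # Zero-mean, unit-variance normalization per feature dimension.
    mean = features.mean(axis=0)
    std = features.std(axis=0) + 1e-12      # guard against zero variance
    normalized = (features - mean) / std

    # Eigen-decomposition of the covariance matrix of the normalized data.
    cov = np.cov(normalized, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    top = eigvecs[:, np.argsort(eigvals)[::-1][:n_components]]
    return normalized @ top

# Example: 50 utterance feature vectors of dimension 36 reduced to 8.
reduced = pca_reduce(np.random.default_rng(0).random((50, 36)), 8)
print(reduced.shape)  # (50, 8)
```

Classification of an unseen utterance then takes place in this reduced PCA space, e.g. by nearest-neighbour matching against the projected training vocabulary.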
In this paper we extract visual features using Zernike moments on a visual vocabulary from a speaker-independent standard dataset of Hindi digits spoken by ten speakers. The visual features were normalized and the dimension of the feature set was reduced by principal component analysis (PCA) in order to recognize isolated word utterances in the PCA space.

2. ZERNIKE POLYNOMIALS

Zernike polynomials are a set of orthogonal polynomials defined on the unit disk:

V_nm(ρ, θ) = R_nm(ρ) e^(jmθ),

where R_nm(ρ) is the radial polynomial of order n with repetition m (n ≥ |m|, n − |m| even). Often, to aid in the interpretation of optical test results, it is convenient to express wavefront data in polynomial form. Zernike polynomials are often used for this purpose since they are made up of terms of the same form as the types of aberrations often observed in optical tests (Zernike, 1934). This is not to say that Zernike polynomials are the best polynomials for fitting test data; sometimes they give a poor representation of the wavefront data. For example, Zernike polynomials have little value when air turbulence is present. Likewise, fabrication errors in the single-point diamond turning process cannot be represented using a reasonable number of Zernike terms. In the testing of conical optical elements, additional terms must be added to Zernike polynomials to accurately represent alignment errors. The blind use of Zernike polynomials to
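As a concrete sketch of the definition above, the following computes a single Zernike moment of a grayscale image patch (such as a segmented mouth region) mapped onto the unit disk. The radial polynomial follows the standard definition; the image size and the (n, m) order shown are illustrative assumptions, not values from the paper.

```python
# Minimal sketch: Zernike moment A_nm of an image over the unit disk,
# using the standard radial polynomial R_nm. The magnitude |A_nm| is
# rotation-invariant, which makes Zernike moments attractive as lip-shape
# descriptors.
import numpy as np
from math import factorial

def radial_poly(rho, n, m):
    """Radial polynomial R_nm(rho), valid for n >= |m|, n - |m| even."""
    m = abs(m)
    r = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * factorial(n - s)
                 / (factorial(s)
                    * factorial((n + m) // 2 - s)
                    * factorial((n - m) // 2 - s)))
        r += coeff * rho ** (n - 2 * s)
    return r

def zernike_moment(img, n, m):
    """Zernike moment A_nm of a 2-D image mapped onto the unit disk."""
    h, w = img.shape
    y, x = np.mgrid[:h, :w]
    # Map pixel coordinates into [-1, 1] x [-1, 1].
    x = (2 * x - w + 1) / (w - 1)
    y = (2 * y - h + 1) / (h - 1)
    rho = np.hypot(x, y)
    theta = np.arctan2(y, x)
    mask = rho <= 1.0                      # keep pixels inside the disk
    v = radial_poly(rho, n, m) * np.exp(-1j * m * theta)
    return (n + 1) / np.pi * np.sum(img[mask] * v[mask])

# Example: one low-order moment of a 64x64 uniform patch.
img = np.ones((64, 64))
print(abs(zernike_moment(img, 2, 0)))
```

In a VSR pipeline, the magnitudes of a set of low-order moments (several (n, m) pairs) would form the per-frame visual feature vector that is subsequently normalized and reduced by PCA.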