David C. Wyld et al. (Eds) : CSITY, SIGPRO, DTMN - 2015
pp. 23–30, 2015. © CS & IT-CSCP 2015 DOI : 10.5121/csit.2015.50303
HINDI DIGITS RECOGNITION SYSTEM ON SPEECH DATA COLLECTED IN DIFFERENT NATURAL NOISE ENVIRONMENTS
Babita Saxena¹ and Charu Wahi²
Department of Computer Science, Birla Institute of Technology, Noida
babita.gs@gmail.com
charu@bitmesra.com
ABSTRACT
This paper presents a baseline digit speech recognizer for the Hindi language. The recording
environment differs for every speaker, since the data was collected in the speakers' respective
homes. The environments include vehicle horn noise in road-facing rooms, indoor background
noise such as doors opening in other rooms, and near silence in others. All of these recordings
are used to train the acoustic model, which is built from the audio data of 8 speakers. The
vocabulary size of the recognizer is 10 words. The HTK toolkit is used to build the acoustic
model and to evaluate the recognition rate of the recognizer. The efficiency of the recognizer
developed on the recorded data is shown at the end of the paper, and possible directions for
future research are suggested.
KEYWORDS
HMM, Acoustic Model, Digit Speech Recognition, Grammar
1. INTRODUCTION
In the last few years, algorithms based on the Hidden Markov Model (HMM) have been the most
successful techniques used in speech recognition systems. We use this approach in our
experiments to build a Digit Speech Recognition (DSR) system for Hindi. Building a DSR system
requires computing acoustic characteristics such as pitch and formant frequencies. These
characteristics are captured and used to build models, which are then used for recognition.
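The idea of building one model per word and using the models for recognition can be illustrated with a minimal sketch. It is not the HTK pipeline used in this work: it assumes discrete observation symbols instead of continuous acoustic feature vectors, scores with the forward algorithm, and uses two hand-made toy models. The labels "ek" and "do", the probability values, and all function names are illustrative assumptions.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps a zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (forward algorithm)."""
    n = len(start)
    # Initialisation: start probability times emission of the first symbol.
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    # Induction: sum over predecessor states, then emit the next symbol.
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[r] + safe_log(trans[r][s]) for r in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def recognize(obs, models):
    """Pick the word whose model assigns the highest likelihood to obs."""
    return max(models, key=lambda w: forward_log_likelihood(obs, *models[w]))


# Toy two-state left-to-right models over the symbols {0, 1} (assumed values).
START = [1.0, 0.0]
TRANS = [[0.6, 0.4], [0.0, 1.0]]
MODELS = {
    "ek": (START, TRANS, [[0.9, 0.1], [0.1, 0.9]]),  # mostly 0s, then 1s
    "do": (START, TRANS, [[0.1, 0.9], [0.9, 0.1]]),  # mostly 1s, then 0s
}

print(recognize([0, 0, 1, 1], MODELS))
print(recognize([1, 1, 0, 0], MODELS))
```

In a full recognizer the symbols would be replaced by feature vectors extracted from the audio, the emission tables by continuous densities, and the model parameters would be estimated from the training recordings instead of written by hand.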
In this paper we present our work on building an acoustic model for Hindi digits. Hindi belongs
to the Indo-Aryan family of languages and is written in the Devanagari script. There are 11
vowels and 35 consonants in standard Hindi. In addition, 5 Nukta consonants have been adopted
from Farsi/Arabic sounds.