International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 02 Issue: 02 | May-2015 www.irjet.net p-ISSN: 2395-0072
© 2015, IRJET.NET- All Rights Reserved Page 444
Development and Implementation of Algorithm for Speaker
recognition for Gujarati Language
Jigarkumar Patel1, Arun Nandurbarkar2
1 PG Student, Electronics and Communication, L.D. College of Engineering, Gujarat, India
2 Associate Professor, Electronics and Communication, L.D. College of Engineering, Gujarat, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Modern speech understanding systems
merge interdisciplinary technologies from Signal
Processing, Pattern Recognition, Natural Language and
Linguistics into a unified statistical framework. In this
paper, weighted Mel frequency cepstral coefficients (MFCC)
and a Gaussian mixture model (GMM) are implemented for
speaker recognition in the Gujarati language. The
experimental database consists of 30 speakers, 10 female
and 20 male, recorded in a soundproof room. The
experimental results indicate that this technique
performs better for speaker recognition in the Gujarati
language than traditional MFCC with GMM.
Key Words: speaker recognition; Mel frequency
cepstral coefficients; feature extraction; weighted Mel
frequency cepstral coefficients; Gaussian Mixture
Model; maximum likelihood.
1. INTRODUCTION
Modern speech understanding systems merge
interdisciplinary technologies from Signal Processing,
Pattern Recognition, Natural Language and Linguistics
into a unified statistical framework. These systems, which
have applications in a wide range of signal processing
problems, represent a revolution in Digital Signal
Processing (DSP)[1][2]. Once a field dominated by
vector-oriented processors and linear algebra-based
mathematics, the current generation of DSP-based systems
relies on sophisticated statistical models implemented
using a complex software paradigm. Such systems are now
capable of understanding continuous speech input for
vocabularies of several thousand words in operational
environments.
Speech signal processing technology is an indispensable
technology in the information society, and speaker
recognition is an important research field of speech
processing. Speaker recognition, also called voiceprint
recognition, identifies or verifies the identity of a
speaker from speech features. It combines theories from
several disciplines, such as acoustics, phonetics,
linguistics, physiology, digital signal processing,
pattern recognition, and artificial intelligence. Speaker
recognition has wide application prospects in judicial
identification, security monitoring, e-commerce, and
other fields. The extraction of Mel frequency cepstral
coefficients is one of the popular approaches to feature
extraction.
Speaker modeling is the main part of a speaker
recognition system. The Gaussian mixture model (GMM) is
the most common approach for speaker modeling in
text-independent speaker recognition[4][5]. A general
speaker recognition system, shown in Figure 1, consists
mainly of three stages, each of which is explained in the
following sections.
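[Block diagram. Blocks: Feature Extraction, Training, Classification, Models. Signals: speech wave (input), Xt (feature vectors), Speaker ID (output).]
Figure 1. Speaker Recognition System[5].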
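The training and classification stages of Figure 1 can be illustrated with a minimal sketch. It is not the implementation used in this paper: a single diagonal Gaussian per speaker stands in for the GMM (that is, a one-component mixture), the toy two-dimensional vectors stand in for real feature vectors Xt, and all function names and speaker labels are hypothetical.

```python
import math


def train_speaker_model(vectors):
    """Training stage: fit a diagonal Gaussian (mean, variance per dimension).

    Assumption: one Gaussian stands in for the speaker's GMM.
    """
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    # Floor the variance so a constant dimension cannot cause division by zero.
    var = [max(sum((v[d] - mean[d]) ** 2 for v in vectors) / n, 1e-6)
           for d in range(dim)]
    return mean, var


def log_likelihood(vectors, model):
    """Total log-likelihood of a sequence of feature vectors under one model."""
    mean, var = model
    total = 0.0
    for v in vectors:
        for x, m, s2 in zip(v, mean, var):
            total += -0.5 * (math.log(2 * math.pi * s2) + (x - m) ** 2 / s2)
    return total


def identify(vectors, models):
    """Classification stage: return the speaker whose model scores highest."""
    return max(models, key=lambda spk: log_likelihood(vectors, models[spk]))


# Toy 2-D "features" for two hypothetical speakers (illustrative data only).
training = {
    "spk_a": [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]],
    "spk_b": [[3.0, 2.9], [3.2, 3.1], [2.9, 3.0]],
}
models = {spk: train_speaker_model(x) for spk, x in training.items()}
print(identify([[3.1, 3.0], [2.8, 3.2]], models))  # prints spk_b
```

Scoring in the log domain keeps the product of per-frame likelihoods from underflowing when an utterance contains many frames.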
2. FEATURE EXTRACTION
2.1 MFCC (Mel Frequency Cepstral Coefficients)
The purpose of feature extraction is to convert the speech
waveform into a set of features for further analysis.
Appropriate information is estimated from the speech
signal, in a suitable form and size, to obtain a good
representation of the speaker's characteristics. Mel
Frequency Cepstral Coefficients (MFCC) are chosen in this
paper because they are based on the perceptual
characteristics of the human auditory system[4]. Figure 2
shows a block diagram of the steps in Mel feature
extraction.
[Block diagram: continuous speech -> Frame Blocking -> frame -> Windowing -> FFT -> spectrum -> Mel-frequency Wrapping -> mel spectrum -> Cepstrum -> mel cepstrum]
Figure 2. MFCC feature extraction block diagram
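The stages of Figure 2 can be sketched in a few lines of dependency-free Python. This is a minimal illustration of plain (unweighted) MFCC under assumed settings, not the configuration used in this paper: the Hamming window, the frame length of 256 samples, the hop of 128 samples, the 20 triangular filters, the 12 coefficients, and the naive DFT used in place of a fast FFT are all assumptions, and the function names are hypothetical.

```python
import cmath
import math


def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def frame_blocking(signal, frame_len, hop):
    """Split the signal into overlapping frames of frame_len samples."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def hamming(frame):
    """Windowing: taper the frame edges (Hamming window is an assumption)."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


def power_spectrum(frame):
    """Naive O(n^2) DFT standing in for the FFT; returns |X[k]|^2, k = 0..n/2."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))) ** 2
            for k in range(n // 2 + 1)]


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):          # rising slope
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling slope, peak of 1 at mid
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank


def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_ceps=12):
    """Return one list of n_ceps cepstral coefficients per frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for frame in frame_blocking(signal, frame_len, hop):
        spec = power_spectrum(hamming(frame))
        # Mel spectrum: filterbank energies, floored to keep the log finite.
        log_mel = [math.log(sum(w * p for w, p in zip(filt, spec)) + 1e-10)
                   for filt in bank]
        # Cepstrum: DCT-II of the log mel spectrum; coefficient 0 is skipped.
        ceps = [sum(e * math.cos(math.pi * n * (j + 0.5) / n_filters)
                    for j, e in enumerate(log_mel))
                for n in range(1, n_ceps + 1)]
        features.append(ceps)
    return features


# Usage: a synthetic 440 Hz tone sampled at 8 kHz (illustrative input only).
fs = 8000
tone = [math.sin(2 * math.pi * 440 * t / fs) for t in range(1024)]
feats = mfcc(tone, fs)
print(len(feats), len(feats[0]))  # prints 7 12
```

Each row of the result is one feature vector for one frame; a production system would normally replace the naive DFT with an FFT routine for speed.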