International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 03 Issue: 04 | Apr-2016 www.irjet.net p-ISSN: 2395-0072
© 2016, IRJET ISO 9001:2008 Certified Journal Page 709
A Survey on Speaker Recognition With Various Feature Extraction And
Classification Techniques
Jyoti B. Ramgire¹, Prof. Sumati M. Jagdale²
¹PG Student, Dept. of Electronics and Telecommunication Engineering, Bharati Vidyapeeth’s College of
Engineering for Women, Pune 43, Maharashtra, India
²Associate Professor, Dept. of Electronics and Telecommunication Engineering, Bharati Vidyapeeth’s College of
Engineering for Women, Pune 43, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Speech processing is becoming more popular day by day
as a means of providing security, and speech is widely used for
authentication. Speaker recognition is the process of verifying
and identifying the person who is speaking; it is distinct from
speech recognition. Speaker recognition systems are widely used in
industries, hospitals, laboratories, etc. Their advantages are
greater security, easy implementation, and user-friendliness. In
areas where security is very important, speaker recognition is
one of the most widely used techniques, and it is also a popular
biometric technique. This paper presents an overview of feature
extraction techniques that can be used in speaker recognition,
such as LPC, LPCC, and MFCC, and discusses classifiers such as
DTW, GMM, VQ, and SVM. The main objective of this review paper is
to summarize well-known techniques for speaker recognition systems.
Key Words: Speaker recognition, Mel frequency cepstral
coefficients (MFCC), Linear predictive coding (LPC),
Linear Predictive Cepstral Coefficients (LPCC), Gaussian
Mixture Model (GMM), Vector Quantization (VQ), Support
Vector Machine (SVM), Dynamic Time Warping (DTW)
1. INTRODUCTION
A speech signal contains different levels of
information [14]. It can be used for speech recognition,
speaker recognition, or voice command recognition
systems [3]. Speaker recognition is used in many speech
processing applications, especially security and
authentication, and security is a major requirement today.
Speech recognition and speaker recognition are sometimes
confused. The two systems are closely related but
different [14]: speech recognition is the process of
recognizing what is being said, whereas speaker recognition
is the process of recognizing who is speaking. Speech
recognition is the ability to automatically recognize the
words spoken by a person based on information in the speech
signal [3]. Speaker recognition is classified into speaker
identification and speaker verification. The main aim of
speaker recognition is to identify the speaker by extraction,
characterization, and recognition of the information
contained in the speech signal [14]. Speech recognition
systems are either speaker dependent or speaker independent.
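The difference between speaker identification and speaker verification can be illustrated with a minimal sketch. It assumes each enrolled speaker is represented by a single fixed-length feature vector and that matching uses Euclidean distance; the names `identify`, `verify`, `enrolled`, and the threshold value are hypothetical and chosen only for illustration, not taken from the surveyed systems.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(features, enrolled):
    """Identification: pick the enrolled speaker whose template is nearest."""
    return min(enrolled, key=lambda name: euclidean(features, enrolled[name]))

def verify(features, claimed, enrolled, threshold):
    """Verification: accept the claimed identity only if it is close enough."""
    return euclidean(features, enrolled[claimed]) <= threshold

# Toy templates (assumed values, not real speech features).
enrolled = {"alice": [1.0, 2.0, 3.0], "bob": [4.0, 0.0, -1.0]}
test_vec = [1.1, 1.9, 3.2]

who = identify(test_vec, enrolled)
accept_alice = verify(test_vec, "alice", enrolled, threshold=0.5)
accept_bob = verify(test_vec, "bob", enrolled, threshold=0.5)
print(who, accept_alice, accept_bob)
```

Identification answers "which enrolled speaker is this?" with a one-of-many decision, while verification answers "is this the claimed speaker?" with a yes/no decision against a threshold.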
Human speech is processed by a machine through feature
extraction and feature matching. The basic model of a
speaker recognition system is shown in Figure 1 [3].
Fig -1: Basic model of Speaker Recognition system
The speaker recognition process is carried out in three steps.
The first is pre-processing, which removes silent periods from
the speech signal [3]. The second is feature extraction, using
techniques such as Linear Predictive Coding (LPC), Linear
Predictive Cepstral Coefficients (LPCC), and Mel Frequency
Cepstral Coefficients (MFCC). The third is feature
classification, using classifiers such as Support Vector
Machine (SVM), Vector Quantization (VQ), Gaussian Mixture
Model (GMM), and Dynamic Time Warping (DTW).
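The pre-processing step can be sketched as follows. The text does not specify how silent periods are detected, so this example assumes a simple short-time energy threshold over non-overlapping frames; the function names, frame length, and threshold are hypothetical values chosen for illustration.

```python
def frame_signal(samples, frame_len, hop):
    """Split samples into frames; trailing samples that do not fill a frame are dropped."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)

def remove_silence(samples, frame_len=4, threshold=0.01):
    """Keep only frames whose energy exceeds the threshold.

    Frames are non-overlapping (hop == frame_len) so that kept
    samples are not duplicated when frames are concatenated.
    """
    kept = []
    for frame in frame_signal(samples, frame_len, hop=frame_len):
        if short_time_energy(frame) > threshold:
            kept.extend(frame)
    return kept

# Toy signal: a quiet frame, a loud frame, then another quiet frame.
signal = [0.0, 0.01, -0.01, 0.0,
          0.5, -0.6, 0.4, -0.5,
          0.0, 0.0, 0.01, 0.0]
voiced = remove_silence(signal)
print(voiced)
```

In practice the threshold would be tuned to the recording conditions, and the retained frames would then be passed to a feature extractor such as LPC, LPCC, or MFCC.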
2. RELATED WORK
Table -1: Literature Survey
Author Name | Feature Extraction | Classifiers | Advantages
V. Tiwari et al. [1] | LPC, LDB, MFCC | VQ | 1. MFCC with a Hanning window and 32 filters is more efficient. 2. The density-matching property of vector quantization is powerful.
K. Kaur et al. [2] | LPC, LPCC, MFCC | VQ, GMM, SVM, DTW, HMM | 1. The MFCC technique is more consistent with human hearing than LPC and LPCC. 2. GMM is best