Int J Speech Technol
DOI 10.1007/s10772-014-9228-6
Manifold learning based speaker dependent dimension reduction
for robust text independent speaker verification
Davood Zabihzadeh · Mohammad H. Moattar
Received: 12 September 2013 / Accepted: 6 February 2014
© Springer Science+Business Media New York 2014
Abstract Speaker verification has been studied widely
from different points of view, including accuracy, robustness
and real-time operation. Recent studies have turned toward
better feature stability and robustness. In this paper we study
the effect of nonlinear manifold based dimensionality reduction
for feature robustness. Manifold learning is a popular
recent approach for nonlinear dimensionality reduction.
Algorithms for this task are based on the idea that each data
point may be described as a function of only a few parameters.
Manifold learning algorithms attempt to uncover these
parameters in order to find a low-dimensional representation
of the data. Among the manifold based dimension reduction
approaches, we apply the widely used Isometric mapping
(Isomap) algorithm. Since in speaker verification
the input utterance is compared with the model of
the claimed client, a speaker dependent feature transformation
would be beneficial for deciding on the identity of the
speaker. Therefore, our first contribution is to use the Isomap
dimension reduction approach in the speaker dependent context
and compare its performance with two other widely used
approaches, namely principal component analysis and factor
analysis. The other contribution of our work is to perform the
nonlinear transformation in a speaker-dependent framework.
We evaluated this approach in a GMM based speaker verification
framework using the Tfarsdat Telephone speech dataset
under different noise types and SNRs, and the evaluations have shown
reliability and robustness even at low SNRs. The results also
show better performance for the proposed Isomap approach
compared to the other approaches.
D. Zabihzadeh (✉)
Department of Computer Engineering,
Asrar Institute of Higher Education, Mashhad, Iran
e-mail: d.zabihzadeh@gmail.com; d-zabihzadeh@asrar.ac.ir
M. H. Moattar
Department of Software Engineering, Mashhad Branch,
Islamic Azad University, Mashhad, Iran
e-mail: moattar@mshdiau.ac.ir
Keywords Noise robust speaker recognition ·
Text independent speaker verification · Dimension
reduction · Manifold learning
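As an illustration only (this is not the authors' implementation), the following minimal sketch shows the idea behind Isomap mentioned in the abstract: distances are measured along the data manifold rather than straight through the ambient space. It covers only the first stage of Isomap, namely a k-nearest-neighbour graph followed by shortest-path (geodesic) distances; the embedding step is omitted. The function names, the toy half-circle data and the choice k = 2 are assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def geodesic_distances(points, k):
    """Approximate geodesic distances: k-NN graph plus Floyd-Warshall."""
    n = len(points)
    inf = float("inf")
    dist = [[inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        # Connect point i to its k nearest neighbours (edges are symmetric).
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: euclidean(points[i], points[j]),
        )[:k]
        for j in neighbours:
            d = euclidean(points[i], points[j])
            dist[i][j] = min(dist[i][j], d)
            dist[j][i] = min(dist[j][i], d)
    # All-pairs shortest paths over the neighbourhood graph.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist


# Toy data (assumed): 21 points on a half circle of radius 1, i.e. a
# one-dimensional manifold embedded in two dimensions.
points = [(math.cos(math.pi * t / 20), math.sin(math.pi * t / 20))
          for t in range(21)]
geo = geodesic_distances(points, k=2)

straight = euclidean(points[0], points[-1])  # chord through the ambient space
along = geo[0][-1]                           # path that follows the curve
print(straight, along)
```

For the two end points, the straight-line distance is the diameter of the circle, whereas the graph distance is close to the arc length π. The single parameter that explains the data (the position along the arc) is what a manifold learning method tries to recover.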
1 Introduction
This paper is concerned with improving the robustness of text-independent
speaker verification. It is known that any mismatch
between the training and testing conditions decreases
the accuracy of speaker recognition. A major focus of
speaker recognition research has been to tackle this mismatch.
It is possible to use generic noise suppression techniques
to enhance the quality of the signal. However, enhancement
techniques increase the computational load of speaker
verification, and it is more desirable to develop a robust
feature extraction approach.
This paper studies the effect of speaker dependent
nonlinear dimension reduction techniques on speaker verification
performance, especially in noisy conditions.
We hypothesize that approaches of this type can capture the
intrinsic structure of speech data and make the feature vector
invariant to noisy conditions. Among the vast number
of nonlinear approaches for dimension reduction, manifold
based techniques are adopted in this paper. Manifold learning
is a popular recent approach for nonlinear dimensionality
reduction (Huo et al. 2007; Lee and Verleysen 2010).
These algorithms are based on the idea that we
can describe the data points as a combination of fewer basis
vectors. Manifold learning approaches are very diverse and
each of them has some specific characteristics. From these