Int J Speech Technol (2013) 16:333–339
DOI 10.1007/s10772-012-9186-9
MCRA noise estimation for KLT-VRE-based speech enhancement
Adda Saadoune · Abderrahmane Amrouche ·
Sid Ahmed Selouani
Received: 13 May 2012 / Accepted: 8 December 2012 / Published online: 5 January 2013
© Springer Science+Business Media New York 2013
Abstract A new signal subspace-based approach is pro-
posed for the enhancement of speech corrupted by a high
level of noise. Conventional subspace-based methods use
the minimum mean square error criterion to optimize the
Karhunen-Loève Transform (KLT). In non-stationary noisy
environments, the selection of the optimal order of the KLT-
based speech enhancement model is a critical issue. In-
deed, estimation of the relevant subspace dimensions de-
pends on the environmental conditions that may change
unpredictably. Therefore, an overly drastic KLT-based dimension
reduction may discard relevant speech components, whereas
a reconstruction that uses too high a KLT model order will
fail to remove the noise. The method presented in this paper
uses a Variance of Reconstruction Error (VRE) criterion to
select the optimal KLT model order. A key feature of this subspace
method is that it incorporates Minima Controlled Recursive
Averaging (MCRA) to estimate the noise Power Spectral
Density (PSD) used in the gain function. Three variants
that combine the VRE with MCRA methods are implemented
and compared, namely VRE-MCRA, VRE-MCRA2 and
VRE-IMCRA. Objective measures show that
VRE-based approaches achieve a lower signal distortion and
a higher noise reduction than existing enhancement methods.
A. Saadoune · A. Amrouche
LCPTS, FEI, USTHB, B.P. 32 El Alia, Bab Ezzouar 16111,
Algeria
A. Saadoune
e-mail: adda.saadoune@umoncton.ca
A. Amrouche
e-mail: namrouche@usthb.dz
A. Saadoune · S.A. Selouani (✉)
University of Moncton, 218 Boul. J-D. Gauthier, Shippagan,
NB E8S 1P6, Canada
e-mail: sid-ahmed.selouani@umoncton.ca
Keywords Subspace speech enhancement ·
Karhunen-Loève transform · Variance of the reconstruction
error · Minima controlled recursive averaging · IMCRA ·
MCRA2
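The order trade-off described in the abstract can be illustrated with a toy sketch. It is not the VRE criterion or the MCRA estimator of this paper: it only truncates a KLT (eigendecomposition of the sample covariance) at a fixed order and measures the error against a known clean signal. The synthetic low-rank "speech", the white noise level, the dimensions and the names `klt_truncate` and `mse` are all assumptions made for illustration.

```python
import numpy as np

def klt_truncate(frames, order):
    """Project frames (n_frames x dim) onto the `order` leading KLT eigenvectors."""
    # Sample covariance; the frames are assumed to be zero-mean.
    cov = frames.T @ frames / frames.shape[0]
    # eigh returns eigenvalues (and matching eigenvectors) in ascending order.
    _, eigvecs = np.linalg.eigh(cov)
    basis = eigvecs[:, ::-1][:, :order]  # keep the `order` largest components
    return frames @ basis @ basis.T

rng = np.random.default_rng(0)
dim, n_frames, true_rank = 20, 2000, 4

# Hypothetical "speech": random mixtures of 4 fixed vectors (rank-4 signal).
mixing = rng.standard_normal((true_rank, dim))
clean = rng.standard_normal((n_frames, true_rank)) @ mixing
# Additive white noise with standard deviation 0.5 (an arbitrary choice).
noisy = clean + 0.5 * rng.standard_normal((n_frames, dim))

def mse(order):
    """Mean squared error between the truncated reconstruction and the clean signal."""
    return float(np.mean((klt_truncate(noisy, order) - clean) ** 2))

# Too low an order, the true signal rank, and the full order (no reduction).
errors = {p: mse(p) for p in (1, 4, 20)}
print(errors)
```

In this toy setting the error should be smallest at the true rank: order 1 discards most of the signal, while order 20 returns the noisy frames unchanged. In practice the rank is unknown and the noise is non-stationary, which is why an order-selection criterion is needed.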
1 Introduction
In recent decades, speech enhancement has attracted considerable
interest and has been widely studied to tackle the problem
of noise reduction in adverse conditions (Loizou 2007).
Speech enhancement techniques aim at improving the quality
and intelligibility of speech that has been degraded by
noise. They have been successfully applied to a wide variety
of problems such as correction of reverberation, restoration
of hyperbaric speech, correction of disrupted speech, and pitch
and rate modification, but noise reduction is probably the
issue that has received the most attention. Speech enhancement
can be seen as the design of an optimal filter that effectively
reduces the effect of noise without introducing perceptible
speech distortion. The techniques proposed in the literature
can be roughly classified into four main categories:
spectral subtractive, statistical-model-based,
perceptual-based and subspace decomposition
techniques. Each of these approaches comes with its own
drawbacks, particularly when facing non-stationary
noise.
Spectral subtraction (SS) was one of the earliest meth-
ods used for speech enhancement. Its principle consists of
estimating the noise spectrum during periods of silence and