Int J Speech Technol (2013) 16:333–339
DOI 10.1007/s10772-012-9186-9

MCRA noise estimation for KLT-VRE-based speech enhancement

Adda Saadoune · Abderrahmane Amrouche · Sid Ahmed Selouani

Received: 13 May 2012 / Accepted: 8 December 2012 / Published online: 5 January 2013
© Springer Science+Business Media New York 2013

A. Saadoune · A. Amrouche
LCPTS, FEI, USTHB, B.P. 32 El Alia, Bab Ezzouar 16111, Algeria
A. Saadoune e-mail: adda.saadoune@umoncton.ca
A. Amrouche e-mail: namrouche@usthb.dz

A. Saadoune · S.A. Selouani
University of Moncton, 218 Boul. J-D. Gauthier, Shippagan, NB E8S 1P6, Canada
e-mail: sid-ahmed.selouani@umoncton.ca

Abstract A new signal subspace-based approach is proposed for the enhancement of speech corrupted by a high level of noise. Conventional subspace-based methods use the minimum mean square error criterion to optimize the Karhunen-Loève Transform (KLT). In non-stationary noisy environments, the selection of the optimal order of the KLT-based speech enhancement model is a critical issue. Indeed, estimation of the relevant subspace dimensions depends on environmental conditions that may change unpredictably. A drastic KLT-based dimension reduction may therefore discard relevant components of speech; conversely, a reconstruction using a higher order of the KLT model will be ineffective at removing the noise. The method presented in this paper uses a Variance of Reconstruction Error (VRE) criterion to optimally select the KLT model order. A prominent feature of this subspace method is that it incorporates Minima Controlled Recursive Averaging (MCRA) to estimate the noise Power Spectral Density (PSD) used in the gain function. Three variants of the VRE combined with MCRA methods are implemented and compared, namely VRE-MCRA, VRE-MCRA2 and VRE-IMCRA. Objective measures show that VRE-based approaches achieve a lower signal distortion and a higher noise reduction than existing enhancement methods.
Keywords Subspace speech enhancement · Karhunen-Loève transform · Variance of the reconstruction error · Minima controlled recursive averaging · IMCRA · MCRA2

1 Introduction

In recent decades, speech enhancement has attracted considerable interest and has been widely studied as a way to tackle the problem of noise reduction in adverse conditions (Loizou 2007). Speech enhancement techniques aim at improving the quality and intelligibility of speech that has been degraded by noise. They have been successfully used in a wide variety of problems such as correction of reverberation, restoration of hyperbaric speech, correction of disrupted speech, and pitch and rate modification, but noise reduction is probably the issue that has received the most attention. Speech enhancement can be seen as a process that aims at designing an optimal filter that can effectively reduce the noise effect without introducing perceptual speech distortion.

Several techniques have been proposed in the literature for speech enhancement. These techniques can be roughly classified into four main categories: spectral subtractive, statistical-model-based, perceptual-based and subspace decomposition techniques. Each of these approaches comes with its own drawbacks, particularly when faced with non-stationary noise.

Spectral subtraction (SS) was one of the earliest methods used for speech enhancement. Its principle consists of estimating the noise spectrum during periods of silence and