Int J Speech Technol
DOI 10.1007/s10772-017-9406-4

Speech enhancement using MMSE estimation under phase uncertainty

Ravikumar Kandagatla 1 · P. V. Subbaiah 2

Received: 9 December 2016 / Accepted: 25 March 2017
© Springer Science+Business Media New York 2017

* Ravikumar Kandagatla, 2k6ravi@gmail.com
1 Lakireddy Baliredy Engineering College, Mylavaram, Krishna District, Andhra Pradesh, India
2 Velagapudi Siddhartha Engineering College, Vijayawada, Krishna District, Andhra Pradesh, India

Abstract Most speech enhancement algorithms process the amplitudes of speech but leave the phase of the noisy speech unprocessed, since modifying it may cause undesired artifacts. Recently, short-time Fourier transform based single-channel speech enhancement algorithms have been developed that take uncertain prior knowledge of the phase into account. This uncertain knowledge of the phase is obtained from phase reconstruction algorithms. The goal of this paper is to develop a joint minimum mean square error estimate of the complex speech coefficients given uncertain phase (CUP) information, using the Nakagami probability density function (PDF) and the gamma PDF as speech spectral amplitude priors and the generalized gamma PDF as the noise prior. Amplitude given uncertain phase estimators, which use the uncertain phase only for amplitude estimation and not for phase improvement, are also developed. Experimental results show that incorporating uncertain phase information improves the quality and intelligibility of speech. In addition, novel phase-blind estimators are developed using the Nakagami/gamma PDF as speech priors and the generalized gamma PDF as the noise prior. Finally, all estimators that use uncertain prior phase information are compared, and the effect of the initial phase information on the enhancement process is analyzed with the novel estimators. For the comparison of all derived estimators, speech signals uttered by male and female speakers are taken from the TIMIT database. The proposed CUP estimators outperform the existing algorithms in terms of the objective performance measures segmental signal-to-noise ratio, phase signal-to-noise ratio, perceptual evaluation of speech quality, and short-time objective intelligibility.

Keywords Speech enhancement · Von Mises distribution · Generalized gamma distribution · Noise reduction · Phase uncertainty

1 Introduction

In mobile communication, background noise has an adverse effect on the speech signal. Noise reduction algorithms play an important role in countering such background noise. In this regard, single-channel speech enhancement algorithms (which improve quality and intelligibility) that use Bayesian estimators play an important role among traditional speech enhancement techniques. Bayesian estimators estimate the clean speech by assuming priors for the speech and noise components. They estimate either the complex speech spectral coefficients, denoted S, or the real-valued clean speech amplitude, denoted A. Different estimators have previously been derived that incorporate Gaussian and non-Gaussian speech priors (Martin 2005) and compressed amplitudes (You et al. 2005; Breithaupt et al. 2008), the latter for perceptually better enhanced speech. Traditional approaches process only the spectral amplitudes (Hendriks 2013) and use the unprocessed noisy phase for reconstruction. Paliwal et al. (2011) note that the phase plays an important role in single-channel speech enhancement. Krawczyk and Gerkmann (2014) and Mowlaee and Kulmer (2015) proposed methods for estimating the clean speech spectral phase. The phase of clean speech is reconstructed and the