Int J Speech Technol
DOI 10.1007/s10772-017-9406-4
Speech enhancement using MMSE estimation under phase
uncertainty
Ravikumar Kandagatla¹ · P. V. Subbaiah²
Received: 9 December 2016 / Accepted: 25 March 2017
© Springer Science+Business Media New York 2017
terms of the objective performance measures segmental
signal-to-noise ratio, phase signal-to-noise ratio, perceptual
evaluation of speech quality, and short-time objective intelligibility.
Keywords Speech enhancement · Von Mises
distribution · Generalized gamma distribution · Noise
reduction · Phase uncertainty
1 Introduction
In mobile communication, background noise has an adverse
effect on the speech signal. To combat such background
noise, noise reduction algorithms play an important role.
Among traditional speech enhancement techniques, single-channel
speech enhancement algorithms (which improve quality and
intelligibility) based on Bayesian estimators are prominent.
Bayesian estimators estimate the clean speech by assuming
priors for the speech and noise components. They estimate
either the complex speech spectral coefficients, denoted S,
or the real-valued clean speech amplitude, denoted A.
Previously, different estimators have been derived that
incorporate Gaussian and non-Gaussian speech priors
(Martin 2005) and compressed amplitudes (You et al. 2005;
Breithaupt et al. 2008), the latter for perceptually better
enhanced speech.
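The distinction between the complex coefficient S and the real-valued amplitude A can be illustrated with a minimal sketch. It is not one of the estimators derived in this paper: a plain Wiener gain is swapped in as a stand-in amplitude estimator, and the function names, the a priori SNR value `xi`, and the example coefficients are all hypothetical choices made for illustration.

```python
import numpy as np

def wiener_gain(xi):
    """Wiener gain for an a priori SNR xi (stand-in, not the paper's estimator)."""
    return xi / (1.0 + xi)

def amplitude_estimate(Y, xi):
    """Real-valued estimate of the clean amplitude A from the noisy coefficient Y."""
    return wiener_gain(xi) * np.abs(Y)

def reconstruct_with_noisy_phase(A_hat, Y):
    """Complex estimate of S built from an amplitude estimate and the noisy phase."""
    return A_hat * np.exp(1j * np.angle(Y))

# Hypothetical single time-frequency bin: clean coefficient S plus noise N.
S = 1.0 * np.exp(1j * 0.3)
N = 0.4 - 0.2j
Y = S + N

xi = 4.0  # assumed a priori SNR, chosen only for this example
A_hat = amplitude_estimate(Y, xi)              # amplitude-domain estimate
S_hat = reconstruct_with_noisy_phase(A_hat, Y) # complex-domain estimate
```

In this sketch the amplitude estimate changes only the magnitude of the noisy coefficient, so the phase of `S_hat` equals the phase of `Y` and any phase error caused by the noise is carried into the reconstruction unchanged.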
Traditional approaches process only the spectral amplitudes
(Hendriks 2013); the noisy phase is left unprocessed and is
used as is for reconstruction. It is noted in Paliwal et al.
(2011) that the phase plays an important role in single-channel
speech enhancement. In addition, Krawczyk and Gerkmann (2014)
and Mowlaee and Kulmer (2015) proposed methods for estimating
the clean speech spectral phase. The phase of clean speech is
reconstructed and the
Abstract Most speech enhancement algorithms process the
amplitudes of speech but leave the phase of the noisy speech
unprocessed, as modifying it may cause undesired artifacts.
Recently, short-time Fourier transform based single-channel
speech enhancement algorithms have been developed that take
uncertain prior knowledge of the phase into account. This
uncertain knowledge of the phase is obtained from phase
reconstruction algorithms. The goal of this paper is to
develop a joint minimum mean square error estimate of the
complex speech coefficients given uncertain phase (CUP)
information, considering the Nakagami probability density
function (PDF) and the gamma PDF as speech spectral amplitude
priors and the generalized gamma PDF as the noise prior.
Estimators of the amplitudes given uncertain phase, which use
the uncertain phase only for amplitude estimation and not for
phase improvement, are also developed. Experimental results
show that incorporating uncertain phase information improves
the quality and intelligibility of speech. Novel phase-blind
estimators are also developed using the Nakagami/gamma PDF as
speech priors and the generalized gamma PDF as the noise prior.
Finally, all estimators using uncertain prior phase information
are compared, and the effect of the initial phase information
on the enhancement process is analyzed with the novel
estimators. For this comparison, speech signals uttered by male
and female speakers are taken from the TIMIT database. The
proposed CUP estimators outperform the existing algorithms in
* Ravikumar Kandagatla
2k6ravi@gmail.com
¹ Lakireddy Baliredy Engineering College,
Mylavaram, Krishna District, Andhra Pradesh, India
² Velagapudi Siddhartha Engineering College,
Vijayawada, Krishna District, Andhra Pradesh, India