I.J. Image, Graphics and Signal Processing, 2015, 6, 29-37
Published Online May 2015 in MECS (http://www.mecs-press.org/)
DOI: 10.5815/ijigsp.2015.06.04
Copyright © 2015 MECS

Dominant Frequency Enhancement of Speech Signal to Improve Intelligibility and Quality

Premananda B.S.
Department of Telecommunication, R.V. College of Engineering, Bengaluru, India
Email: premanandabs@rvce.edu.in

Uma B.V.
Department of Electronics & Communication, R.V. College of Engineering, Bengaluru, India
Email: umabv@rvce.edu.in

Abstract—In mobile devices, the perceived speech signal deteriorates significantly in the presence of near-end noise, as the noise arrives directly at the listener's ears in a noisy environment. There is an inherent need to increase the clarity and quality of the received speech signal in noisy environments. This is accomplished by incorporating speech enhancement algorithms at the receiver end. The objective is to improve the intelligibility and quality of the speech signal by dynamically enhancing it when the near-end noise dominates. This paper proposes a speech enhancement approach that incorporates the threshold of hearing and the auditory masking properties of the human ear. Using the masking properties, the speech samples that are audible can be identified. In low-SNR environments, selected audible samples can be enhanced to improve the clarity of the signal, rather than enhancing every loud sample. Intelligibility and quality of the enhanced speech signal are measured using the Speech Intelligibility Index and the Perceptual Evaluation of Speech Quality. Experimental results indicate that the proposed method improves the intelligibility and quality of the speech signal over the unprocessed far-end speech signal. This approach is efficient in overcoming the deterioration of speech signals in a noisy environment.
Index Terms—Dominant, Near-end noise, Psychoacoustics, Speech enhancement, Speech intelligibility, Speech quality

I. INTRODUCTION

Mobile devices are the most popular consumer devices in the present day. For a conversation in a quiet environment, a low speech level is enough for the speakers to understand each other. However, if, for instance, a train passes by, the conversation is severely disturbed. To overcome this effect, we must either wait until the train passes or raise the speech level to increase its loudness. The external volume control of a mobile phone is impractical for this purpose because the background noise changes dynamically. As the noise signal itself cannot be modified, a reasonable approach is to manipulate the far-end speech signal based on the energy of the near-end noise. Hence, the problem calls for speech enhancement algorithms that improve speech perception in adverse listening conditions.

The nature of the speech enhancement differs depending on the specific application. At the receiving end, referred to as the "near-end" in the literature, the listener may be in a noisy environment. This makes hearing difficult even when the transmitting speech source is in a quiet environment, because the near-end noise reaches the listener's ear directly. The listener experiences fatigue as the quality of the speech signal deteriorates. The presence of noise masks the speech signal and makes it less intelligible or audible. This effect is called masking and is of two types: simultaneous masking and temporal masking. In simultaneous masking, a signal is masked by another signal (predominantly noise) that is present at the same time. In temporal masking, the signal is masked for a short interval before and after a loud noise occurs. Hence, the speech signal needs to be enhanced with both of these situations taken into account.
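The audibility idea described above can be sketched in a few lines. The following is a minimal illustration, not the method proposed in this paper: it uses Terhardt's well-known approximation of the threshold of hearing in quiet and treats a speech component as audible only if its level exceeds both that threshold and the noise level in the same frequency bin. The function names, the per-bin comparison, and the example levels are assumptions made for illustration; levels are assumed to be calibrated in dB SPL, and temporal masking is not modeled.

```python
import numpy as np


def threshold_in_quiet_db(freq_hz):
    """Terhardt's approximation of the threshold of hearing in quiet (dB SPL)."""
    f_khz = np.asarray(freq_hz, dtype=float) / 1000.0
    return (3.64 * f_khz ** -0.8
            - 6.5 * np.exp(-0.6 * (f_khz - 3.3) ** 2)
            + 1e-3 * f_khz ** 4)


def audible_mask(speech_db, noise_db, freq_hz):
    """Hypothetical helper: True where speech exceeds both the threshold
    in quiet and the noise level in the same bin (a crude stand-in for
    simultaneous masking; no spreading across bins, no temporal masking)."""
    floor_db = np.maximum(threshold_in_quiet_db(freq_hz), noise_db)
    return np.asarray(speech_db, dtype=float) > floor_db


# Illustrative levels only (dB SPL), not measured data.
freqs = np.array([100.0, 1000.0, 4000.0])
speech = np.array([20.0, 40.0, 30.0])
noise = np.array([10.0, 55.0, 20.0])

mask = audible_mask(speech, noise, freqs)
print(mask)
```

In this toy case the 100 Hz component falls below the threshold in quiet, the 1 kHz component is covered by the louder noise, and only the 4 kHz component remains audible. A fuller psychoacoustic model would also spread the masking effect across neighbouring frequency bands.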
The basic idea of including masking effects in speech enhancement is to remove the spectral components of the speech signal that are inaudible or masked. Hence, speech enhancement involves not only raising the speech level for human listening but also processing the signal further before it is heard. The objective of signal enhancement is to improve perceptual aspects of speech such as overall quality and intelligibility. Speech enhancement algorithms should provide superior performance, in both clarity and quality, over a broad range of SNRs.

The effect of far-end noise on the speech signal can be tackled with traditional noise suppression algorithms such as the minimum mean-square error (MMSE) short-time spectral amplitude (STSA) estimator [18], spectral subtraction methods [20], etc. The far-end noise reduction techniques discussed in the literature [18-20] are not suitable in the present context, as they mitigate noise at the speaker end rather than at the receiver end. Near-end noise cannot be influenced because the listener is located in a noisy environment, and the noise reaches the ears with hardly