Automatic Preprocessing Technique for Detection of Corrupted Speech Signal Fragments for the Purpose of Speaker Recognition

Konstantin Simonchik¹, Sergei Aleinik¹, Dmitry Ivanko¹

¹ ITMO University, 49 Kronverkskiy pr., St. Petersburg, 197101, Russia
{simonchik, aleinik, ivanko}@speechpro.com

Abstract. In this paper we propose a preprocessing technique that detects clicks, tones, overloads, clipping, etc., and identifies the parts of the speech signal that are of good quality. As a result, the performance of the speaker recognition system increases significantly. It should be noted that, in describing the noise detectors, we aim only to provide a full list of the algorithms we used, together with the parameter values we obtained in our experiments. The main goal of the paper is to demonstrate that a set of simple detectors is very effective at detecting speech for the speaker recognition task under real noise conditions.

Keywords: Preprocessing, Speaker recognition, Speech processing.

1 Introduction

The input of a speaker identification system is often a mixture of a distorted speech signal and various additive noises. Two classic approaches to handling this problem are well known. The first is to increase the robustness of the algorithms: feature compensation, model adaptation, score normalization, etc. The second is to apply noise cancellation techniques at the input stage. In practice, however, real-world audio signals are in many cases only partially (not totally) corrupted. For example, GSM bursts and telephone bells are short-time noises that can be detected easily. A preprocessing technique that detects corrupted fragments of the input signal (i.e. fragments that are distorted or have a high level of noise) and excludes them from the subsequent processing stages may therefore be useful for automatic speaker recognition.

2 Preprocessing technique

The structure of the proposed preprocessing is presented in Fig. 1. It contains several components: three levels of detectors, two resampling units, and two logic units. An important point is that the detectors are ordered and connected so as to avoid both their mutual influence and the influence of resampling. Indeed, resampling smooths short impulses and sharp power bursts, which leads to poor detection of clipping, clicks, and overloads.
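
The effect of resampling on clipping detection can be illustrated with a small sketch. It is not the detector used in this paper: the test signal, the full-scale value `LIMIT`, the tolerance `tol`, and the helpers `clipped_tone`, `clipped_fraction`, and `crude_downsample` are all assumptions made for illustration, and a moving average followed by decimation stands in for a real resampler only because it is short.

```python
import math

LIMIT = 1.0  # assumed full-scale amplitude of the recording


def clipped_tone(n=800, cycles_per_sample=0.05, gain=2.0):
    """Sine tone amplified past full scale and hard-clipped (toy test signal)."""
    return [
        max(-LIMIT, min(LIMIT, gain * math.sin(2 * math.pi * cycles_per_sample * i)))
        for i in range(n)
    ]


def clipped_fraction(x, tol=1e-3):
    """Fraction of samples that sit at (or within tol of) full scale."""
    return sum(1 for v in x if abs(v) >= LIMIT - tol) / len(x)


def crude_downsample(x, taps=9, step=2):
    """Moving-average low-pass followed by decimation.

    A stand-in for a real resampler, not the resampling unit of the paper.
    """
    smoothed = [sum(x[i:i + taps]) / taps for i in range(len(x) - taps + 1)]
    return smoothed[::step]


signal = clipped_tone()
before = clipped_fraction(signal)
after = clipped_fraction(crude_downsample(signal))
print(f"clipped fraction before resampling: {before:.2f}")
print(f"clipped fraction after resampling:  {after:.2f}")
```

In this toy case the flat tops of the clipped tone are shorter than the averaging window, so after resampling no sample remains at full scale and a level-based clipping detector would find nothing. A practical resampling filter may behave differently in detail (for example, it can ring around sharp edges), but the general point is the same: detectors that rely on sharp signal features are better placed before the resampling units.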