1 MACHINE LEARNING APPROACH FOR VOICED/UNVOICED/SILENCE SPEECH SEGMENT DETECTION Mohammed Abebe, moshethio@gmail.com 1 Sebsibie Hailemariam (PhD), sebsibe2004@yahoo.com 2 1 Department of Computer Science & IT, Arba Minch University, Arba Minch, Ethiopia 2 Department of Computer Science, Addis Ababa University, Addis Ababa, Ethiopia ABSTRACT In this study a supervised method of voiced/unvoiced/silence speech segment detection is developed on Amharic speech corpus. A total of 900 Amharic sentences are collected which are acoustically tagged with three possible acoustic tags (voiced, unvoiced and silence). A speech corpus is prepared by recording these 900 sentences. Then the speech signal is segmented at 25 ms frame length with no overlapping. The researchers extracted frame energy, number of zero crossings (ZCR), Linear Prediction Coefficients (LPC), Mel Frequency Cepstral Coefficient (MFCC), Mel Frequency Cepstral Coefficient-Principal Component Analysis (MFCC-PCA) features from each frame and combined in six different forms. The features were further processed to generate feature inventory for classification purpose by assigning the dependent variable (class label) which is the three values (voiced, unvoiced or silence). Using this feature, we train five different classification models: two rule-based (Decision Tables and JRip), two decision trees (J48 and simple CART) and a neural network or MLP (with one and two hidden layers). The MLP with one hidden layer shows the highest performance using the combined feature (MFCC, Energy and ZCR) with a performance of 89.33%. Further, the researchers attempted to tune the parameters on MLP by changing the frame size, the learning rate, number of hidden layers and the number of neurons per hidden layer of the MLP. Finally, the experiments show that best performance with the selected classifier model and feature vector is achieved on 35 ms frame size with an accuracy of 89.69%. Keywords: Voicing Detection, Voiced, Unvoiced, Silence, Machine Learning, Speech Segment Detection 1. INTRODUCTION The knowledge of acoustic speech feature in particular voiced or unvoiced segment plays an important role in many speech analysis-synthesis systems. Thus the issue of voicing detection (Voiced/Unvoiced/Silence) algorithms (VDAs) has been one of the topics most analyzed in the field of speech processing research during the last three decades (Beritelli, F., et al, 2009). Voiced/Unvoiced/Silence (VUS) speech segment detection is a method of assigning and labeling a specific speech category (voiced/unvoiced/silence) to a speech segment. An accurate classification of a speech segment with voicing detection is often used as a prerequisite for developing other higher level and efficient applications of speech processing systems such as speech coding, speech analysis, speech synthesis, automatic speech recognition, noise suppression and enhancement, speaker identification, and the recognition of speech pathologies. Voiced/Unvoiced detection involves identifying the regions of speech when there is significant glottal activity (i.e., the vibration of vocal folds). Such regions of speech are generally referred to as voiced speech (Dhananjaya, N., & B.Yegnanarayana, 2010). Voiced speeches include all vowels and some consonants, such as /m/, /n/, /l/, /w/, /b/, /d/, /g/, /v/, and /z/. Unvoiced speeches are produced by a turbulent air flow crossing some constriction in the vocal tract, without vibration of the vocal cords. Unvoiced sounds include consonants like /p/, /t/, /k/, /f/. Silence is produced as a result of air pressure emanating from lung without any constriction along the path in the speech production system (Ladefoged, P., 2001). 2. APPROACHES FOR VUS DETECTION So far many voicing detection researches have been done and different approaches have been used for (VUS) classification, where the well-known ones are rule-based, statistical and neural network approach. The rule based approach as its name indicates relies on rules which are either handcrafted or machine learned rules. The rules are the important elements