Recognition of Greek Phonemes using Support Vector Machines

Iosif Mporas, Todor Ganchev, Panagiotis Zervas, Nikos Fakotakis

Wire Communications Laboratory, Dept. of Electrical and Computer Engineering
University of Patras, 261 10 Rion, Patras, Greece
Tel: +30 2610 997336, Fax: +30 2610 997336
{imporas, tganchev, pzervas, fakotaki}@wcl.ee.upatras.gr

Abstract. In the present work we study the applicability of Support Vector Machines (SVMs) to the phoneme recognition task. Specifically, the Least Squares version of the algorithm (LS-SVM) is employed to recognize Greek phonemes in the framework of a telephone-driven, voice-enabled information service. The N-best candidate phonemes are identified and subsequently fed to the speech and language recognition components. In a comparative evaluation of various classification methods, the SVM-based phoneme recognizer demonstrated superior performance. A recognition rate of 74.2% was achieved from the N-best list, for N=5, prior to applying the language model.

1 Introduction

The increased interest of the market in multilingual speech-enabled systems, such as telephone-driven information access systems, has created a need for computationally efficient and noise-robust speech and language recognition methods. In speech and language recognition, the phonotactic approach has become very popular, since it offers a good trade-off between recognition accuracy and the amount of data required for training. In brief, in the phonotactic approach the speech signal is decoded into a phoneme sequence, which is further processed by a statistical language model for the language of interest. This technique, proposed by Zissman [1], is known as phoneme recognition followed by language model (PRLM). Due to the success of the phonotactic approach, phoneme recognition has become a cornerstone of every speech and language recognition component.

To date, various approaches to phoneme recognition have been proposed.
In [2], a combination of context-dependent and context-independent ANNs led to a phoneme recognition accuracy of about 46%. Phoneme recognition using independent component analysis (ICA)-based feature extraction [3] yielded an accuracy of 51%. A continuous-mixture HMM-based phoneme recognizer with a conventional three-state left-to-right architecture [4] achieved a recognition performance of 54%. A language-dependent approach to phoneme recognition demonstrated accuracy in the range of 45% to 55% [5]. A speaker-independent approach, using multiple codebooks of various LPC parameters and discrete HMMs, achieved 65% accuracy on context-independent test