Computers in Biology and Medicine 37 (2007) 571–578
www.intl.elsevierhealth.com/journals/cobm

Wavelet time-frequency analysis and least squares support vector machines for the identification of voice disorders

Everthon Silva Fonseca a,b,∗, Rodrigo Capobianco Guido a, Paulo Rogério Scalassara a, Carlos Dias Maciel a, José Carlos Pereira a

a SEL/EESC/USP and IFSC/USP, Department of Electrical Engineering, School of Engineering at São Carlos and Institute of Physics at São Carlos, University of São Paulo, SP, Brazil
b EE/UCLA, School of Engineering and Applied Sciences, University of California at Los Angeles, CA, USA

Abstract

This work describes a novel algorithm to identify laryngeal pathologies through digital analysis of the voice. It is based on Daubechies' discrete wavelet transform (DWT-db), linear prediction coefficients (LPC), and least squares support vector machines (LS-SVM). Wavelets with different support sizes and three LS-SVM kernels are compared. In particular, the proposed approach, implemented with modest computer requirements, leads to an adequate larynx pathology classifier to identify nodules in the vocal folds. It achieves over 90% classification accuracy and has a low order of computational complexity in relation to the speech signal's length.

© 2006 Published by Elsevier Ltd.

Keywords: Voice disorders; Wavelet transform; LPC; SVM; Pattern recognition in spoken language

1. Introduction

Discrete-time processing of recorded voice signals [1] can be used to detect different acoustical characteristics that differentiate between normal and pathologically affected human voices. Pathologies related to the glottal tract are usually identified through acoustic perceptual standards like breathiness, hoarseness and harshness [2–4]. However, due to the complex structure of the biological system for speech synthesis, pathologies with harsh characteristics may be confused with those perceptually defined as hoarse [5].
The turbulence in glottal flow, resulting from malfunction of the vocal folds, can be quantified by the noise in the spectral components of speech [6]. Pathologies caused by soft or incomplete closure of the glottis, such as nodules in the vocal folds, are often associated with high-frequency noise [7,8]. Thus, we intend to analyze this particular high-frequency characteristic of pathologically affected voices in order to distinguish them from normal ones.

Based on Everthon Silva Fonseca's Ph.D. Thesis.
∗ Corresponding author.
E-mail addresses: everthon@sel.eesc.usp.br (E.S. Fonseca), guido@ifsc.usp.br (R.C. Guido), scalassara@sel.eesc.usp.br (P.R. Scalassara), maciel@sel.eesc.usp.br (C.D. Maciel), pereira@sel.eesc.usp.br (J.C. Pereira).
0010-4825/$ - see front matter © 2006 Published by Elsevier Ltd.
doi:10.1016/j.compbiomed.2006.08.008

Most of the recent computer-based algorithms for laryngeal pathology detection described in the literature are based on wavelets, fractals or neural maps and networks [9,10]. Neural maps and networks account for over 95% of the existing techniques; some of them reach almost 100% accuracy when a good procedure is used to train the classifiers, but sometimes at a high order of computational complexity in relation to the signal's length. Usually, in this last kind of classifier, the voices are clustered with respect to the following parameters: formant frequencies, pitch period and its deviations, stability of pitch period during vowel phonation, degree of dissimilarity of the shape of the pitch, low-to-high energy ratio (LHER), noise-to-harmonics ratio (NHR) and harmonics-to-noise ratio (HNR). Fractal-based classifiers have about 90% classification accuracy, but they usually detect only some particular pathologies, such as Friedreich's ataxia [11,12]. Best-basis wavelet classifiers produce about 85% classification accuracy.
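
The high-frequency noise associated above with incomplete glottal closure can be made concrete with a small numerical illustration. The following Python fragment is only a sketch under stated assumptions and is not the algorithm proposed in this work: it applies a single analysis level of the 4-tap Daubechies filter (db2) with periodic extension and reports the fraction of signal energy found in the detail (upper half-band) coefficients. The function names dwt_level and high_band_energy_ratio, the choice of db2, and the single decomposition level are all hypothetical choices made for illustration.

```python
import math

# Standard 4-tap Daubechies (db2) low-pass analysis filter.
_S3 = math.sqrt(3.0)
_NORM = 4.0 * math.sqrt(2.0)
LOW = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]
# Quadrature-mirror high-pass filter: g[k] = (-1)^k * h[3 - k].
HIGH = [((-1) ** k) * LOW[3 - k] for k in range(4)]


def dwt_level(signal):
    """One db2 analysis level with periodic extension (illustrative helper)."""
    n = len(signal)
    if n < 4 or n % 2:
        raise ValueError("signal length must be even and at least 4")
    approx, detail = [], []
    for i in range(0, n, 2):
        # Four consecutive samples, wrapping around at the end of the signal.
        window = [signal[(i + k) % n] for k in range(4)]
        approx.append(sum(h * x for h, x in zip(LOW, window)))
        detail.append(sum(g * x for g, x in zip(HIGH, window)))
    return approx, detail


def high_band_energy_ratio(signal):
    """Fraction of the signal energy in the first-level detail coefficients."""
    approx, detail = dwt_level(signal)
    e_low = sum(a * a for a in approx)
    e_high = sum(d * d for d in detail)
    total = e_low + e_high
    return e_high / total if total > 0 else 0.0


if __name__ == "__main__":
    # Synthetic stand-ins for voice frames (not real recordings): a slow
    # sinusoid, and the same sinusoid with a sample-rate alternating component.
    clean = [math.sin(2 * math.pi * 4 * t / 256) for t in range(256)]
    noisy = [x + 0.3 * (-1) ** t for t, x in enumerate(clean)]
    print("clean:", high_band_energy_ratio(clean))
    print("noisy:", high_band_energy_ratio(noisy))
```

Because the periodic db2 transform is orthogonal, the approximation and detail energies add up to the energy of the input, so the ratio lies between 0 and 1. A larger value indicates that more of the energy sits in the upper half of the spectrum. A practical system would use more decomposition levels and real recordings.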

This work proposes an algorithm, with a low order of computational complexity, to identify patients with nodules [13] in vocal folds. It is based on Daubechies' discrete wavelet transform