IJBSCHS(2009-14-2-8) Biomedical Soft Computing and Human Sciences, Vol.14,No.2,pp.55-60 (2009)
Special Issue: Bio-sensors: Data Acquisition, Processing and Control
[Original article] Copyright©1995 Biomedical Fuzzy Systems Association
(Accepted on 2009.01.01)
Diagnosis of Voice Disorders using Mel Scaled WPT and Functional
Link Neural Network
Paulraj M P, Sazali Yaacob, and M. Hariharan
School of Mechatronics Engineering, Universiti Malaysia Perlis, Perlis, Malaysia
The paper was received on March 31, 2008.
Abstract: Voice disorders are increasing dramatically due to the modern way of life. Most
voice disorders cause changes in the voice signal, so acoustic analysis of the speech signal can be a
useful tool for diagnosing them. This paper applies Mel-scaled wavelet packet transform
(Mel-scaled WPT) based features to the diagnosis of voice disorders. A Functional Link
Neural Network (FLNN) is developed to test the usefulness of the suggested features. Two simple
modifications to the FLNN architecture are proposed to improve the classification accuracy. In the first
architecture, a hidden layer is introduced into the FLNN and trained by the Back Propagation (BP)
procedure. In the second architecture, the Integral and Derivative controller concepts are introduced to the
neurons in the hidden layer and the network is trained by the BP procedure. The performance is compared
with that of a conventional neural network model. The results indicate that the proposed FLNN gives very
promising classification accuracy and that the suggested features can be employed clinically to diagnose
voice disorders.
Keywords: Acoustic Analysis, Voice Disorders, Mel-scaled Wavelet Packet Transform
(Mel-scaled WPT), Functional Link Neural Network (FLNN)
1. Introduction
Voice is a highly multivariate component of
speech, and the need for its quantitative
description has led to the development of clinical
tools. With the rapid development of signal
processing techniques, the voice signal can be
used for the detection of voice disorders. Voice
signal information plays an important role in
understanding how vocal fold pathologies form.
In recent years, much work has been carried out
on the automatic detection and classification
of voice pathologies by means of acoustic analysis,
parametric and non-parametric feature extraction,
automatic pattern recognition or statistical methods
[1-4]. Feature extraction with these methods
involves handling overlapping windows of the
speech signal and a large number of computations
during the feature extraction phase. In
recent years, the wavelet transform has been applied
to a wide range of problems in signal and image
processing, and speech processing is one of these areas.
Speech is a highly non-stationary signal, and the Fourier
transform is not well suited to analyzing
non-stationary signals because the time-domain information
is lost in the frequency transformation.
When looking at the Fourier transform of a signal,
it is impossible to tell when a particular event took
place. The wavelet transform is a good tool for
the analysis of non-stationary signals, as it is useful in
localizing a symptom both in time and in frequency scale.
Hence, wavelet analysis has the potential for the
identification of voice disorders. In [5], the authors
presented a procedure to identify pathological disorders
of the larynx using wavelet analysis. P. S. Bhat et al. [6]
proposed a method for the classification and
analysis of speech abnormalities based on wavelet
analysis and an artificial neural network.
02600, Jejawi, Arau, Kangar,
Perlis, Malaysia.
Phone No. 006049798918 Fax No. 006049798142
Email: paul@unimap.edu.my
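The time-localization argument made in the introduction can be illustrated with a minimal Python sketch, which is not part of the original study. It builds two toy signals containing the same two tones in opposite order. Their DFT magnitude spectra coincide, whereas the level-1 Haar detail coefficients, used here only as the simplest wavelet-style local measure, show in which half of the signal the higher tone occurs. The signal length, the tone frequencies and all function names are illustrative assumptions.

```python
import cmath
import math


def dft_magnitude(x):
    """Magnitude of the discrete Fourier transform, computed directly."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def haar_detail(x):
    """Level-1 Haar detail coefficients: one value per pair of samples."""
    return [(x[i] - x[i + 1]) / math.sqrt(2) for i in range(0, len(x) - 1, 2)]


def half_energies(coeffs):
    """Energy of the coefficients in the first and second half of the signal."""
    half = len(coeffs) // 2
    return (sum(c * c for c in coeffs[:half]), sum(c * c for c in coeffs[half:]))


N = 64  # assumed toy signal length
# Low tone (4 cycles per N samples) followed by a high tone (16 cycles).
low_then_high = [
    math.sin(2 * math.pi * (4 if t < N // 2 else 16) * t / N) for t in range(N)
]
# Same samples in reverse order: the high tone now comes first.
high_then_low = low_then_high[::-1]

mag_a = dft_magnitude(low_then_high)
mag_b = dft_magnitude(high_then_low)
# The magnitude spectrum cannot tell the two signals apart.
spectra_match = all(abs(a - b) < 1e-9 for a, b in zip(mag_a, mag_b))

# The local detail energy reveals where the high tone is located.
energy_a = half_energies(haar_detail(low_then_high))
energy_b = half_energies(haar_detail(high_then_low))

print("identical magnitude spectra:", spectra_match)
print("detail energy (first half, second half), signal A:", energy_a)
print("detail energy (first half, second half), signal B:", energy_b)
```

In this toy case the detail energy is concentrated in the second half for the first signal and in the first half for the second, even though the two magnitude spectra agree.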
S. Datta and his co-workers have developed a
new filter structure for speech recognition based on a
Mel-like Admissible Wavelet Packet Structure [7-9].
These filters have the advantage that their frequency
band spacing is similar to the Mel scale. The wavelet
packet has the advantage that it can segment the frequency
axis while keeping uniform translation in time. This
property of partitioning the frequency axis is used to
realize a conjugate mirror filter structure similar
to that of a Mel filter.
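How a wavelet packet tree can yield Mel-like band spacing may be sketched as follows. This is a hypothetical illustration and not the admissible structure of [7-9]: it uses the commonly quoted Mel formula m = 2595 log10(1 + f/700) and an assumed splitting rule in which a band is halved, as a wavelet packet node is split into a low-pass and a high-pass child, whenever it is wider than a chosen number of Mels. The sampling rate, the width threshold and the maximum depth are assumptions made only for this example.

```python
import math


def hz_to_mel(f_hz):
    """Commonly used Hz-to-Mel conversion (assumed, not given in the paper)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_like_bands(lo_hz, hi_hz, max_mel_width, max_depth, depth=0):
    """Recursively halve [lo_hz, hi_hz] until each band is narrow enough on
    the Mel scale or the maximum tree depth is reached (hypothetical rule)."""
    mel_width = hz_to_mel(hi_hz) - hz_to_mel(lo_hz)
    if mel_width <= max_mel_width or depth >= max_depth:
        return [(lo_hz, hi_hz)]
    mid_hz = (lo_hz + hi_hz) / 2.0
    return (
        mel_like_bands(lo_hz, mid_hz, max_mel_width, max_depth, depth + 1)
        + mel_like_bands(mid_hz, hi_hz, max_mel_width, max_depth, depth + 1)
    )


SAMPLE_RATE_HZ = 16000  # assumed sampling rate
bands = mel_like_bands(0.0, SAMPLE_RATE_HZ / 2.0, max_mel_width=300.0, max_depth=6)
widths_hz = [hi - lo for lo, hi in bands]

for lo, hi in bands:
    print(f"{lo:7.1f} - {hi:7.1f} Hz  ({hz_to_mel(hi) - hz_to_mel(lo):6.1f} Mel)")
```

With these assumed settings the bands are narrow at low frequencies and wide at high frequencies, which is the qualitative behaviour of a Mel filter bank. A full wavelet packet implementation would additionally apply the conjugate mirror filters at each split.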
The aim of this paper is to apply a Mel-scaled
wavelet packet filter to extract features from
the voice signal. A Functional Link Neural
Network is developed to test the efficacy of the
feature vector derived using the Mel-scaled wavelet
packet filter. The experimental results indicate that
the FLNN gives very promising classification accuracy
and that the proposed features can be used as an additional
acoustic indicator for the diagnosis of voice disorders.
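For readers unfamiliar with functional link networks, the basic idea can be sketched in a few lines of Python. This is a minimal sketch of a conventional flat FLNN under stated assumptions, not the modified architectures proposed in this paper: each input is expanded with trigonometric terms (an assumed basis) and a single sigmoid unit is trained by gradient descent. The toy data, the learning rate and all function names are hypothetical.

```python
import math


def functional_expansion(x):
    """Expand each input value with sine and cosine terms (assumed basis)."""
    expanded = [1.0]  # bias term
    for value in x:
        expanded += [value, math.sin(math.pi * value), math.cos(math.pi * value)]
    return expanded


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, x):
    """Output of a single sigmoid unit fed with the expanded input."""
    phi = functional_expansion(x)
    return sigmoid(sum(w * p for w, p in zip(weights, phi)))


def train(samples, labels, learning_rate=0.5, epochs=500):
    """Batch gradient descent on the cross-entropy loss, zero initial weights."""
    weights = [0.0] * len(functional_expansion(samples[0]))
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for x, y in zip(samples, labels):
            phi = functional_expansion(x)
            error = y - sigmoid(sum(w * p for w, p in zip(weights, phi)))
            for j, p in enumerate(phi):
                gradient[j] += error * p
        weights = [
            w + learning_rate * g / len(samples) for w, g in zip(weights, gradient)
        ]
    return weights


# Toy problem: class 1 if |x| > 0.5, which no single threshold on x can solve.
samples = [[i / 10.0] for i in range(-10, 11) if abs(i) != 5]
labels = [1.0 if abs(x[0]) > 0.5 else 0.0 for x in samples]

weights = train(samples, labels)
correct = sum(
    (predict(weights, x) > 0.5) == (y == 1.0) for x, y in zip(samples, labels)
)
accuracy = correct / len(samples)
print("training accuracy:", accuracy)
```

The expansion makes the toy classes separable by a single unit because cos(pi x) changes sign at |x| = 0.5. In the setting of this paper the inputs would instead be the Mel-scaled WPT feature vectors.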