Voice Recognition Algorithm for Portable Assistive
Devices
Hossein Ghaffari Nik, Gregory M. Gutt, Nathalia Peixoto
Neural Engineering Lab, Electrical and Computer Engineering, George Mason University
Fairfax, Virginia
hghaffar@gmu.edu, npeixoto@gmu.edu
Abstract — We present the implementation of a robust
voice recognition algorithm for voice-activated control of
assistive devices. We implemented an effective method based
on cross-correlation of Mel-frequency cepstral coefficients
(MFCCs). The method yields high accuracy in low-noise
environments. Because our implementation is based on a
set of training samples for each command, it can easily be
adapted to any user. Once the training set is loaded, each
spoken command is compared to the MFCCs of all samples in
the training set, and a “winner-takes-all” rule decides
which group the command belongs to.
I. INTRODUCTION
The ultimate goal of this project is to develop a
straightforward and effective voice recognition method that
can be easily integrated into the joystick of a powered
wheelchair, enabling voice control for quadriplegic and
other disabled individuals. To fully control a Voice
Controlled Wheelchair (VCW), only a few isolated words
need to be recognized (e.g., go, stop, right, left, and
backward). However, because the control words may be
embedded in continuous speech, it is of paramount importance
to implement safety mechanisms during control. Several
speech recognition methods exist, such as Hidden Markov
Models (HMMs) [9], Multiple Vector Quantization of HMMs
(MVQHMMs) [8], and complex neural network-based voice
recognition [3], and many VCW prototypes have been proposed
during the past decade [7]; however, none seems simple and
robust enough to be implemented and deployed for disabled
individuals [6],[10].
The method for this project was originally designed,
implemented, and tested on a robotic arm built with the
Lego Mindstorms™ NXT (see Fig. 1), and in this paper we
report results obtained with that system. The robot is
capable of drawing a circle, a square, or a triangle upon
spoken command. It is controlled via a USB or Bluetooth
connection to a PC; all programming was developed in
Matlab™ 7.1 (MathWorks, Natick, MA). The voice recognition
algorithm yields high accuracy in recognizing the words
“triangle”, “circle”, and “square” under low ambient noise.
II. METHODS
Here we present a method based on cross-correlation of
Mel-frequency cepstral coefficients (MFCCs) for speech
recognition of isolated words [1],[5]. We developed the
system and implemented it on three fronts: (1) the robot,
(2) the computer/robot interface, and (3) the voice
recognition program. A set of 15 training samples, each a
2-second recording, was collected in Matlab™ for each
command: “circle”, “square”, and “triangle”. The MFCCs of
the recorded training sets were calculated and stored in
memory for later comparison. To recognize a given command,
the MFCCs of the spoken word are calculated and
cross-correlated with those stored in memory from the
training samples. After comparison of the given
Figure 1. The voice-activated robotic arm, capable of three-dimensional
movement, built with Lego Mindstorms™ NXT and controlled through
Matlab™. Note the microphone on the side (lower left) and the self-
correcting mechanism that controls the pressure of the stylus on the
magnetic pad. The block with the four buttons contains a 32-bit ARM7
microcontroller. The potentiometers on the motors were secured with
cable ties. Scale: 1 cm.
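The recognition pipeline described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' Matlab™ implementation: it assumes the MFCC matrices have already been computed (synthetic arrays stand in for real features here), and the helper names `xcorr_score` and `classify` are hypothetical. It shows one plausible reading of the cross-correlation comparison and the winner-takes-all decision.

```python
import numpy as np

def xcorr_score(a, b):
    """Peak of the normalized cross-correlation between two MFCC matrices.

    Both matrices are z-scored and flattened; the maximum of the full
    cross-correlation sequence, scaled by the signal length, is close
    to 1 for similar utterances and near 0 for unrelated ones.
    """
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    c = np.correlate(a.ravel(), b.ravel(), mode="full")
    return c.max() / a.size

def classify(command_mfcc, training_sets):
    """Winner-takes-all: assign the label of the single training sample
    that best matches the spoken command."""
    best_label, best_score = None, -np.inf
    for label, samples in training_sets.items():
        for sample in samples:
            score = xcorr_score(command_mfcc, sample)
            if score > best_score:
                best_label, best_score = label, score
    return best_label

# Demo with synthetic "MFCC" arrays (13 coefficients x 40 frames),
# standing in for features extracted from the 2-second recordings.
rng = np.random.default_rng(0)
templates = {w: rng.standard_normal((13, 40))
             for w in ("circle", "square", "triangle")}
# 15 noisy variants per command play the role of the training set.
training = {w: [t + 0.1 * rng.standard_normal(t.shape) for _ in range(15)]
            for w, t in templates.items()}
query = templates["square"] + 0.1 * rng.standard_normal((13, 40))
print(classify(query, training))  # prints "square"
```

In the actual system the MFCC matrices would come from a feature extractor applied to the microphone recordings; the fixed 13×40 shape and the noise model used above are assumptions for illustration only.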
997 1-4244-1262-5/07/$25.00 ©2007 IEEE IEEE SENSORS 2007 Conference