Voice Recognition Algorithm for Portable Assistive
Devices
Hossein Ghaffari Nik, Gregory M. Gutt, Nathalia Peixoto
Neural Engineering Lab, Electrical and Computer Engineering, George Mason University
Fairfax, Virginia
hghaffar@gmu.edu, npeixoto@gmu.edu
Abstract — We present the implementation of a robust
voice recognition algorithm for voice-activated control of
assistive devices. We implemented an effective method based
on cross-correlation of Mel-frequency cepstral coefficients
(MFCCs). The method yields high accuracy in low-noise
environments. Because our implementation is based on a
set of training samples for each command, it can easily be
adapted to any user. Once the training set is loaded, each
spoken command is compared to the MFCCs of all samples in
the training set, and a “winner-takes-all” rule decides
which group the command belongs to.
I. INTRODUCTION
The ultimate goal of this project is to develop a
straightforward and effective voice recognition method that
can be easily integrated into the joystick of a powered
wheelchair, enabling voice control for quadriplegic and
other disabled individuals. To fully control a Voice
Controlled Wheelchair (VCW), only a few isolated words
need to be recognized (e.g., go, stop, right, left, and
backward). However, because the control words may be
embedded in continuous speech, it is of paramount importance
to implement safety mechanisms during control. Several
speech recognition methods exist, such as Hidden Markov
Models (HMMs) [9], Multiple Vector Quantization of HMMs
(MVQHMMs) [8], and complex neural network-based voice
recognition [3], and many VCW prototypes have been proposed
during the past decade [7]; however, none seems simple and
robust enough to be implemented and deployed for disabled
individuals [6],[10].
The method for this project was originally designed,
implemented, and tested on a robotic arm built with the
Lego Mindstorms™ NXT (see Fig. 1), and in this paper we
report results obtained with that system. The robot is
capable of drawing a circle, a square, or a triangle upon
spoken command. It is controlled via a USB or Bluetooth
connection to a PC; all programming was developed in
Matlab™ 7.1 (MathWorks, Natick, MA). The voice recognition
algorithm yields high accuracy in recognizing the words
“triangle”, “circle”, and “square” under low ambient noise.
II. METHODS
Here we present a method based on cross-correlation of
Mel-frequency cepstral coefficients (MFCCs) for speech
recognition of isolated words [1],[5]. We developed the
system and implemented it on three fronts: (1) the robot,
(2) the computer/robot interface, and (3) the voice
recognition program. A set of 15 training samples, each a
2-second recording, was collected in Matlab™ for each
command: “circle”, “square”, and “triangle”. The MFCCs of
the recorded training sets were calculated and stored in
memory for later comparison. To recognize a given command,
the MFCCs of the spoken word are calculated and
cross-correlated with those stored in memory from the
training samples. After comparison of the given
Figure 1. The voice-activated robotic arm, capable of three-dimensional
movement, built with Lego Mindstorms™ NXT and controlled through
Matlab™. Note the microphone on the side (lower left) and the self-
correcting mechanism that controls the pressure of the stylus on the
magnetic pad. The block with the four buttons contains a 32-bit ARM7
microcontroller. The potentiometers on the motors were secured with
cable ties. Scale: 1 cm.
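The recognition pipeline described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' Matlab™ implementation: it assumes the MFCC matrices have already been computed (synthetic arrays stand in for real features here), and the helper names `xcorr_score` and `classify` are hypothetical. It shows one plausible reading of the cross-correlation comparison and the winner-takes-all decision.

```python
import numpy as np

def xcorr_score(a, b):
    """Peak of the normalized cross-correlation between two MFCC matrices.

    Both matrices are z-scored and flattened; the maximum of the full
    cross-correlation sequence, scaled by the signal length, is close
    to 1 for similar utterances and near 0 for unrelated ones.
    """
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    c = np.correlate(a.ravel(), b.ravel(), mode="full")
    return c.max() / a.size

def classify(command_mfcc, training_sets):
    """Winner-takes-all: assign the label of the single training sample
    that best matches the spoken command."""
    best_label, best_score = None, -np.inf
    for label, samples in training_sets.items():
        for sample in samples:
            score = xcorr_score(command_mfcc, sample)
            if score > best_score:
                best_label, best_score = label, score
    return best_label

# Demo with synthetic "MFCC" arrays (13 coefficients x 40 frames),
# standing in for features extracted from the 2-second recordings.
rng = np.random.default_rng(0)
templates = {w: rng.standard_normal((13, 40))
             for w in ("circle", "square", "triangle")}
# 15 noisy variants per command play the role of the training set.
training = {w: [t + 0.1 * rng.standard_normal(t.shape) for _ in range(15)]
            for w, t in templates.items()}
query = templates["square"] + 0.1 * rng.standard_normal((13, 40))
print(classify(query, training))  # prints "square"
```

In the actual system the MFCC matrices would come from a feature extractor applied to the microphone recordings; the fixed 13×40 shape and the noise model used above are assumptions for illustration only.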
997 1-4244-1262-5/07/$25.00 ©2007 IEEE IEEE SENSORS 2007 Conference