GA-based Feature Subset Selection for Myoelectric Classification

Mohammadreza Asghari Oskoei and Huosheng Hu
Department of Computer Science, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, UK
masgha@essex.ac.uk, hhu@essex.ac.uk

Abstract – This paper presents an ongoing investigation into selecting an optimal subset of features from a set of well-known myoelectric signal (MES) features in the time and frequency domains. Four channels of myoelectric signals from upper limb muscles are used to classify six distinct activities. A cascaded genetic algorithm (GA) is adopted as the search strategy for feature subset selection. The Davies–Bouldin index (DBI) and Fisher's linear discriminant index (FLDI) are employed as the filter objective functions, and linear discriminant analysis (LDA) is used as the wrapper objective function. The results show more accurate and reliable classification when the elite subset of features is applied to artificial neural networks as the classifier.

Index Terms – Feature Subset Selection, EMG / Myoelectric signal classification, Genetic Algorithm, Class Separability index.

I. INTRODUCTION

Myoelectric signals (MES) contain rich information that can be used in a human-machine interface to manipulate assistive devices and robots according to the user's intention. However, most current myoelectric control systems suffer from low accuracy and instability in multi-function control. It is extremely challenging to interpret MES data and classify its features accurately and reliably enough to control more than one or two functions.

Features play a key role in MES classification: they represent the signals to the classifier, so selecting optimal features is essential. In general, there are two distinct approaches to providing efficient features for the classifiers, namely feature extraction (projection) and feature selection.
Feature extraction creates a set of new features by combining the existing features through a linear or nonlinear mapping, whereas feature selection chooses a subset of the existing features by searching the feasible space.

Englehart et al. [1], [4] show that, for time-scale features, feature projection using principal component analysis (PCA) provides a far more effective means of dimensionality reduction than feature selection by class separability (CS). They demonstrate that the wavelet transform (WT) and the wavelet packet transform (WPT) outperform time domain (TD) features when a PCA/LDA combination is used for dimensionality reduction and classification. Chu et al. [12] propose a linear-nonlinear feature projection method composed of PCA and a self-organizing feature map (SOFM) that performs both dimensionality reduction and nonlinear mapping. This method overcomes a shortcoming of PCA, namely that the density functions of the classes are not exactly discriminated, but it imposes a heavy computational burden during training, which involves computing the local discriminant basis (LDB) for WPT, the eigenvectors for PCA, the weight vectors of the SOFM and the weight vectors of the MLP neural network [19].

Zardoshti et al. [15] evaluate MES features using the Davies-Bouldin index and a K-nearest neighbour nonparametric classifier. The features evaluated are the integral of average value, the variance, the number of zero crossings, the Willison amplitude, the v-order and log detectors, and autoregressive model parameters. A new feature, the MES histogram, is introduced and shown to be the most effective. Park et al. [13] evaluate a set of MES features by comparing the separability measure provided by the Bhattacharyya distance. They show that the adaptive cepstrum vector (ACV) is a more suitable feature for MES pattern classification. Chan et al.
[14], while developing a fuzzy classifier for MES, found that slope sign changes (SSC), introduced as a TD feature by Hudgins [5], not only fail to improve classification performance but even degrade it for some subjects.

The classification of time-scale features is computationally expensive. In their recent work [2][3][16], Englehart, Hudgins and colleagues have preferred TD or frequency domain (FD) features over time-scale features [19]. Moreover, [17] showed that MES can be assumed stationary during the short contractions used in real-time control. Once time-scale features [1][4] are excluded, feature projection may no longer be the best choice for dimensionality reduction in MES.

Feature subset selection (FSS) not only reduces computational cost through dimensionality reduction, but also improves generalization by relying on fewer parameters in pattern recognition. In addition, not all muscles play a similar role in each activity, so subset selection can be extended to channel (muscle) selection. Research shows that the optimum subset of features and channels (muscles) can vary with the subject, the type of motion and the classifier. More interestingly, some features may be effective on certain channels but not on others. In other words, a subset of feature-channel pairs can be selected before off-line training to achieve the highest possible classification accuracy.

Fig. 1 shows a flowchart of MES classification. The raw data collected from the surface of the user's muscles during activities are segmented, and the features are then extracted. Features are the most challenging point in pattern recognition problems, because they should be adequately consistent with the classifiers.
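The segmentation and feature extraction steps described above can be sketched as follows. This is a minimal illustration rather than the implementation used in this work: the window length, step size and amplitude threshold are assumed values, the helper names `segment` and `td_features` are hypothetical, and the four features (mean absolute value, zero crossings, slope sign changes and Willison amplitude) are only a small sample of the TD features discussed in the literature.

```python
import numpy as np

def segment(signal, window, step):
    """Split a (n_samples, n_channels) array into overlapping windows."""
    starts = range(0, signal.shape[0] - window + 1, step)
    return np.stack([signal[s:s + window] for s in starts])

def td_features(seg, threshold=0.01):
    """Compute four TD features per channel for one (window, n_channels) segment.

    The threshold is an assumed noise floor, not a value taken from the paper.
    """
    diff = np.diff(seg, axis=0)
    # Mean absolute value of the samples in the window.
    mav = np.mean(np.abs(seg), axis=0)
    # Zero crossings: adjacent samples of opposite sign with a large enough step.
    zc = np.sum((seg[:-1] * seg[1:] < 0) & (np.abs(diff) >= threshold), axis=0)
    # Slope sign changes: adjacent differences of opposite sign, one above threshold.
    big = (np.abs(diff[:-1]) >= threshold) | (np.abs(diff[1:]) >= threshold)
    ssc = np.sum((diff[:-1] * diff[1:] < 0) & big, axis=0)
    # Willison amplitude: number of sample-to-sample steps above threshold.
    wamp = np.sum(np.abs(diff) >= threshold, axis=0)
    return np.concatenate([mav, zc, ssc, wamp])

# Usage on synthetic four-channel data (random noise stands in for real MES).
rng = np.random.default_rng(0)
raw = rng.standard_normal((1000, 4))
windows = segment(raw, window=200, step=50)
features = np.array([td_features(w) for w in windows])  # one row per window
```

Each row of `features` holds the four features for each of the four channels, so a feature-channel subset corresponds to a subset of its columns.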
Meanwhile, owing to the time constraints of real-time control, the most distinctive features should be selected to feed the classifiers, namely linear discriminant analysis (LDA) and artificial neural networks (ANN).

1-4244-0571-8/06/$20.00 ©2006 IEEE 1465 Proceedings of the 2006 IEEE International Conference on Robotics and Biomimetics December 17 - 20, 2006, Kunming, China
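To make the notion of a "most distinctive" feature concrete, the sketch below scores feature columns with the Davies-Bouldin index in its standard form: for each class, the worst-case ratio of summed within-class scatter to between-centroid distance, averaged over classes, with lower values indicating better separation. It assumes Euclidean distances and mean distance to the centroid as the scatter measure, and it ranks features one at a time, which is a simplification and not the cascaded GA search proposed in this paper. The function names are hypothetical.

```python
import numpy as np

def davies_bouldin(X, y):
    """Davies-Bouldin index of labelled data; lower means better-separated classes."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    labels = np.unique(y)
    centroids = np.array([X[y == k].mean(axis=0) for k in labels])
    # Within-class scatter: mean Euclidean distance of members to their centroid.
    scatter = np.array([
        np.linalg.norm(X[y == k] - c, axis=1).mean()
        for k, c in zip(labels, centroids)
    ])
    worst = []
    for i in range(len(labels)):
        # Similarity of class i to every other class; keep the worst case.
        # Coinciding centroids give a zero distance and hence an infinite score.
        ratios = [
            (scatter[i] + scatter[j]) / np.linalg.norm(centroids[i] - centroids[j])
            for j in range(len(labels)) if j != i
        ]
        worst.append(max(ratios))
    return float(np.mean(worst))

def rank_features(X, y):
    """Score each feature column on its own and return (order, scores), best first."""
    scores = np.array([davies_bouldin(X[:, [f]], y) for f in range(X.shape[1])])
    return np.argsort(scores), scores

# Usage: feature 0 separates the two classes well, feature 1 does not.
X = np.array([[0.0, 0.0], [2.0, 5.0], [10.0, 1.0], [12.0, 6.0]])
y = np.array([0, 0, 1, 1])
order, scores = rank_features(X, y)
```

A subset search such as a GA would instead evaluate `davies_bouldin` on candidate column subsets, so that features which are only useful in combination are not discarded.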