IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 45, NO. 4, APRIL 1997, p. 829

Nonlinear Considerations in EEG Signal Classification

Neep Hazarika, Ah Chung Tsoi, Senior Member, IEEE, and Alex A. Sergejew

Abstract: In this paper, we investigate the effect of incorporating modeling of nonlinearity on the classification of electroencephalogram (EEG) signals using an artificial neural network (ANN). It is observed that the ANN's predictive ability is improved after preprocessing EEG signals using a particular nonlinear modeling technique, viz. a bilinear model, compared with that obtained by using a particular classical linear analysis method, viz. an autoregressive (AR) model. Until recently, linear time-invariant Gaussian modeling has dominated the development of time series modeling and feature extraction. The advantage of such classical models lies in the fact that a complete signal processing theory is available. In the case of EEG signals, where the underlying theory regarding the dynamical law governing the generation of these signals (e.g., the underlying physiological factors) is not completely understood, a case can be made for using improved signal processing models that are not subject to linear constraints. Such models should recognize important features of the observed data that may not be well modeled by a linear time-invariant model. It is known that EEG signals are nonstationary, and it is possible that they may be nonlinear as well. Thus, one way of gaining further insights on the structure of EEG signals is to introduce nonlinear models and higher order spectra. This paper compares the results of classification using a linear AR model with those obtained from a bilinear model. It is shown that in certain cases, the nonlinearity of EEG signals is an important factor that ought to be taken into consideration during preprocessing of the signals prior to the classification task.

Manuscript received January 10, 1997. This work was supported by a grant from the National Health and Medical Research Council and in part by the Australian Research Council. The associate editor coordinating the review of this paper and approving it for publication was Prof. Georgios B. Giannakis.
N. Hazarika is with the Department of Computer Science and Applied Mathematics, Aston University, Birmingham, U.K.
A. C. Tsoi is with the Faculty of Informatics, University of Wollongong, Wollongong, NSW, Australia.
A. Sergejew is with the Centre for Applied Neurosciences and School of Biophysical Sciences and Electrical Engineering, Swinburne University of Technology, Hawthorn, Victoria, Australia.
Publisher Item Identifier S 1053-587X(97)02566-X.

I. INTRODUCTION

AN IMPORTANT objective of time series modeling is to obtain information on the dynamical law that governs its generation. The constructed model should be able to recognize and capture important features of the observed data. The theory of linear time series modeling has been well developed over the years [1], and one can obtain a linear time-invariant model to fit a given set of data with reasonable computational resources, e.g., a personal computer. Both the time domain and the frequency domain approaches have been used in linear time series analysis [1], [2]. We have previously described, in a preliminary study, the use of the autoregressive (AR) model as a feature extraction technique for the classification of electroencephalogram (EEG) signals using artificial neural network techniques [3], [4]. It was found that the multilayer perceptron could, with some success, classify EEG signals obtained from normal subjects, those suffering from severe schizophrenic disorder, and those with obsessive compulsive disorder (OCD).
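For orientation, the two model classes compared in this paper can be written in a common textbook form. The notation below is an assumption made for illustration only (the symbols x, e, a_i, b_{ij} and the orders p, m, k are not taken from this section), and the definitions given later in the paper may differ in sign convention or include additional terms such as moving-average components.

```latex
% AR(p) model: the current sample is a linear combination of past samples
% plus a white-noise innovation e(t).
\begin{align}
  x(t) &= \sum_{i=1}^{p} a_i \, x(t-i) + e(t)
\end{align}

% Bilinear model (assumed form without moving-average terms): the AR part
% is augmented by products of past samples and past innovations.
\begin{align}
  x(t) &= \sum_{i=1}^{p} a_i \, x(t-i)
        + \sum_{i=1}^{m} \sum_{j=1}^{k} b_{ij} \, x(t-i) \, e(t-j)
        + e(t)
\end{align}
```

Setting every b_{ij} to zero in the second expression recovers the AR(p) model, so in this form the linear model is a special case of the bilinear one.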
In the preliminary study, we attempted to use the following signal features in association with a multilayer perceptron as a classification method: AR coefficients, AR spectral estimates, Fourier coefficients, Fourier spectra, and raw time series EEG data. The classification accuracy was best using the AR coefficients of the EEG signals, whereas classification was not as successful when attempted using AR spectral estimates, Fourier coefficients, Fourier spectra, or the raw time series EEG data [3], [4]. It is difficult to conjecture the reasons why only features based on the AR coefficients produced satisfactory results, while features based on the other measures did not perform satisfactorily. Conceptually, all the measures are premised on the same principles. One reason may be the fact that measures based on raw time series and those based on spectra have a larger number of input features than those using AR coefficients. A larger number of inputs requires a larger multilayer perceptron so that the resulting system can be used to discriminate the testing data, and it is relatively more difficult to train a large multilayer perceptron than a smaller one. Specifically, as will be detailed later, in our studies, the EEG time series are segmented into 128-point segments or frames. Classification based on raw data uses 128 points or "features," whereas classification based on AR coefficients uses only eight coefficients to represent the same segment. The relative difficulties of classifying larger numbers of features may be one of the reasons why classification based on the raw time series or spectra was not as successful as that based on AR coefficients.

The underlying theory behind the generation of EEG signals is far from being completely elucidated. The reader is referred to [5] for an exposition of the biophysics of the EEG and the enormous complexities involved. There is some evidence that EEG signals cannot be adequately represented by linear models [6].
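The reduction from 128 raw samples to eight AR coefficients per frame, described earlier in this section, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the text does not say how the AR coefficients were estimated or whether frames overlap, so the sketch assumes non-overlapping frames and Yule-Walker estimation, and all function names are hypothetical.

```python
import numpy as np

FRAME_LEN = 128  # segment length stated in the text
AR_ORDER = 8     # number of AR coefficients per segment stated in the text


def frame_signal(x, frame_len=FRAME_LEN):
    """Split a 1-D signal into non-overlapping frames (assumption),
    dropping any incomplete frame at the end."""
    x = np.asarray(x, dtype=float)
    n_frames = len(x) // frame_len
    return x[: n_frames * frame_len].reshape(n_frames, frame_len)


def ar_yule_walker(frame, order=AR_ORDER):
    """Estimate a_1..a_p for x[t] = sum_k a_k x[t-k] + e[t] by solving
    the Yule-Walker equations (an assumed estimator, not from the paper)."""
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased autocorrelation estimates r[0], ..., r[order].
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])
    # Toeplitz matrix R with R[i, j] = r[|i - j|]; solve R a = r[1:].
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    return np.linalg.solve(r[lags], r[1:])


def ar_features(x, frame_len=FRAME_LEN, order=AR_ORDER):
    """Return one row of AR coefficients per frame."""
    frames = frame_signal(x, frame_len)
    return np.array([ar_yule_walker(f, order) for f in frames])


# Synthetic stand-in for an EEG recording (random data, illustration only).
rng = np.random.default_rng(0)
signal = rng.standard_normal(1280)
features = ar_features(signal)
print(features.shape)  # (10, 8): 10 frames, 8 features each instead of 128
```

Each row of the resulting array would serve as one input vector to a classifier, so the input layer needs 8 units rather than 128, which is the size argument made above.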
When the fast Fourier transform is applied to successive segments of an EEG signal, the frequency spectrum is observed to vary over time as the Fourier coefficients vary [7]. One way to interpret these variations is to hypothesise that the EEG is a nonstationary signal. An alternative interpretation