IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 45, NO. 4, APRIL 1997 829

Nonlinear Considerations in EEG Signal Classification

Neep Hazarika, Ah Chung Tsoi, Senior Member, IEEE, and Alex A. Sergejew

Abstract— In this paper, we investigate the effect of incorporating modeling of nonlinearity on the classification of electroencephalogram (EEG) signals using an artificial neural network (ANN). It is observed that the ANN's predictive ability is improved after preprocessing EEG signals using a particular nonlinear modeling technique, viz. a bilinear model, compared with those obtained by using a particular classical linear analysis method, viz. an autoregressive (AR) model. Until recently, linear time-invariant Gaussian modeling has dominated the development of time series modeling and feature extraction. The advantage of such classical models lies in the fact that a complete signal processing theory is available. In the case of EEG signals, where the underlying theory regarding the dynamical law governing the generation of these signals (e.g., the underlying physiological factors) is not completely understood, a case can be made for using improved signal processing models that are not subject to linear constraints. Such models should recognize important features of the observed data that may not be well modeled by a linear time-invariant model. It is known that EEG signals are nonstationary, and it is possible that they may be nonlinear as well. Thus, one way of gaining further insights on the structure of EEG signals is to introduce nonlinear models and higher order spectra. This paper compares the results of classification using a linear AR model with those obtained from a bilinear model. It is shown that in certain cases, the nonlinearity of EEG signals is an important factor that ought to be taken into consideration during preprocessing of the signals prior to the classification task.

Manuscript received January 10, 1997. This work was supported by a grant from the National Health and Medical Research Council and in part by the Australian Research Council. The associate editor coordinating the review of this paper and approving it for publication was Prof. Georgios B. Giannakis.
N. Hazarika is with the Department of Computer Science and Applied Mathematics, Aston University, Birmingham, U.K.
A. C. Tsoi is with the Faculty of Informatics, University of Wollongong, Wollongong, NSW, Australia.
A. Sergejew is with the Centre for Applied Neurosciences and School of Biophysical Sciences and Electrical Engineering, Swinburne University of Technology, Hawthorn, Victoria, Australia.
Publisher Item Identifier S 1053-587X(97)02566-X.

I. INTRODUCTION

AN IMPORTANT objective of time series modeling is to obtain information on the dynamical law that governs its generation. The constructed model should be able to recognize and capture important features of the observed data. The theory of linear time series modeling has been well developed over the years [1], and one can obtain a linear time-invariant model to fit a given set of data with reasonable computational resources, e.g., a personal computer. Both the time domain and the frequency domain approaches have been used in linear time series analysis [1], [2]. We have previously described, in a preliminary study, the use of the autoregressive (AR) model as a feature extraction technique for the classification of electroencephalogram (EEG) signals using artificial neural network techniques [3], [4]. It was found that the multilayer perceptron could, with some success, classify EEG signals obtained from normal subjects, those suffering from severe schizophrenic disorder, and those with obsessive compulsive disorder (OCD).
In the preliminary study, we attempted to use the following signal features in association with a multilayer perceptron as a classification method: AR coefficients, AR spectral estimates, Fourier coefficients, Fourier spectra, and raw time series EEG data. The classification accuracy was best using the AR coefficients of the EEG signals, whereas classification was not as successful when attempted using AR spectral estimates, Fourier coefficients, Fourier spectra, or the raw time series EEG data [3], [4].

It is difficult to conjecture the reasons why only features based on the AR coefficients produced satisfactory results, while features based on the other measures did not perform satisfactorily. Conceptually, all the measures are premised on the same principles. One reason may be the fact that measures based on raw time series and those based on spectra have a larger number of input features than those using AR coefficients. A larger number of inputs requires a larger multilayer perceptron so that the resulting system can be used to discriminate the testing data, and it is relatively more difficult to train a large multilayer perceptron than a smaller one. Specifically, as will be detailed later, in our studies, the EEG time series are segmented into 128 point segments or frames. Classification based on raw data uses 128 points or “features,” whereas classification based on AR coefficients uses only eight coefficients to represent the same segment. The relative difficulties of classifying larger numbers of features may be one of the reasons why classification based on the raw time series or spectra was not as successful as that based on AR coefficients.

The underlying theory behind the generation of EEG signals is far from being completely elucidated. The reader is referred to [5] for an exposition of the biophysics of the EEG and the enormous complexities involved. There is some evidence that EEG signals cannot be adequately represented by linear models [6].
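The reduction from 128 raw samples to eight AR coefficients per frame, described above, can be illustrated with a minimal sketch. The frame length and model order are taken from the text. Everything else is an assumption made for illustration: the function names, the non-overlapping framing, the least-squares estimator, the sign convention x[t] = a_1 x[t-1] + ... + a_p x[t-p] + e[t], and the synthetic white-noise input standing in for an EEG recording. The estimation procedure actually used in the study is not stated in this section and may differ.

```python
import numpy as np

FRAME_LEN = 128  # frame length stated in the text
AR_ORDER = 8     # number of AR coefficients stated in the text


def split_into_frames(signal, frame_len=FRAME_LEN):
    """Cut a 1-D signal into non-overlapping frames (assumed), dropping any remainder."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    return signal[: n_frames * frame_len].reshape(n_frames, frame_len)


def ar_coefficients(frame, order=AR_ORDER):
    """Least-squares fit of x[t] = a_1 x[t-1] + ... + a_p x[t-p] + e[t] (assumed estimator)."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    # Column k-1 holds the lag-k samples x[t-k] for t = order, ..., N-1.
    lagged = np.column_stack(
        [frame[order - k : len(frame) - k] for k in range(1, order + 1)]
    )
    target = frame[order:]
    coeffs, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    return coeffs


# Synthetic stand-in for an EEG channel (white noise, not real data).
rng = np.random.default_rng(0)
eeg_like = rng.standard_normal(1000)

frames = split_into_frames(eeg_like)
features = np.array([ar_coefficients(f) for f in frames])
print(frames.shape, features.shape)  # (7, 128) (7, 8)
```

Each row of `features` would be one input vector for a classifier such as a multilayer perceptron, so the input layer shrinks from 128 units to eight, which is the size argument made in the text.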
When the fast Fourier transform is applied to successive segments of an EEG signal, the frequency spectrum is observed to vary over time as the Fourier coefficients vary [7]. One way to interpret these variations is to hypothesise that the EEG is a nonstationary signal. An alternative interpretation