Use of principal component analysis for automatic classification of epileptic EEG activities in wavelet framework

U. Rajendra Acharya a, S. Vinitha Sree b,⇑, Ang Peng Chuan Alvin a, Jasjit S. Suri c,d

a Department of Electronics and Communication Engineering, Ngee Ann Polytechnic, Singapore 599489, Singapore
b Global Biomedical Technologies, CA, USA
c CTO, Global Biomedical Technologies, CA, USA
d Biomedical Engineering Department, Idaho State University (Aff), ID, USA

article info

Keywords: Epilepsy; Principal component analysis; Eigenvalues; Classification; Non-linear analysis; Wavelet Packet Decomposition; Ictal; Interictal; Electroencephalogram

abstract

Electroencephalogram (EEG) signals are used to detect and study the characteristics of epileptic activities. Owing to the non-linear and dynamic nature of EEG signals, visual inspection and interpretation of these signals is tedious, time-consuming, error-prone, and subject to inter-observer variability. Therefore, several Computer Aided Diagnostic (CAD) based studies have adopted non-linear techniques to study the normal, interictal, and ictal activities in EEGs. In this paper, we present a novel automatic technique based on data mining for epileptic activity classification. To compare our results with those of related studies in the literature, we evaluated the proposed technique on the widely used benchmark dataset from Bonn University. One hundred samples from each of the normal, interictal, and ictal categories were used. We decomposed these segments into wavelet coefficients using Wavelet Packet Decomposition (WPD), and extracted eigenvalues from the resultant wavelet coefficients using Principal Component Analysis (PCA). Significant eigenvalues, selected using the ANOVA test, were used to train and test several supervised classifiers using the 10-fold stratified cross-validation technique. We obtained 99% classification accuracy using the Gaussian Mixture Model (GMM) classifier.
The proposed technique is capable of classifying EEG segments with clinically acceptable accuracy using fewer features that can be extracted at lower computational cost. The technique can be written as a software application that can be easily deployed at a low cost and used with almost no expert training. We foresee that this software can, in the future, evolve into an efficient adjunct tool that can not only classify epileptic activities in EEG signals but also automatically monitor the onset of seizures, and thereby aid doctors in providing better and more timely care for patients suffering from epilepsy.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

Epilepsy is a common neurological disorder that is characterized by the occurrence of recurrent seizures. World Health Organization statistics (WHO, 2011) indicate that every year between 40 and 70 per 100,000 people are diagnosed with epilepsy in developed countries, and that this figure is almost twice as high in developing countries. Generally, Electroencephalogram (EEG) signals are used to detect seizures (Thakor & Tong, 2004). Typically, physicians analyze the EEG segments for three types of activities: ictal, which is usually characterized by continuous rhythmical activity that has a sudden onset when the patient is exhibiting a seizure; interictal, which is characterized by small spikes and subclinical seizures that generally occur during the time between seizures in epileptic patients; and normal EEG segments. Characterization of EEG segments into these three classes will help physicians in studying the underlying cause of these changes, in monitoring seizures, and also in administering appropriate seizure management protocols in order to improve the quality of life of epileptic patients (Osorio & Frei, 2009; Shoeb, Guttag, Pang, & Schachter, 2009). EEG signal recordings are generally long, and hence, the resulting signal to be analyzed is voluminous.
Visual inspection, therefore, becomes tedious, time-consuming, error-prone, and subject to inter-observer variability. To address these limitations, Computer Aided Diagnostic (CAD) tools have been developed for several medical diagnostic applications.

Expert Systems with Applications 39 (2012) 9072–9078
0957-4174/$ - see front matter. doi:10.1016/j.eswa.2012.02.040
⇑ Corresponding author. E-mail address: vinitha.sree@gmail.com (S. Vinitha Sree).

Many automated CAD techniques extract linear time-domain and frequency-domain based features from the EEG signal to detect epileptiform discharges. EEG signals are by nature non-linear (Kannathal et al., 2005a; Lehnertz, 2008; Pijn et al., 1997; Subha, Joseph, Acharya, & Lim, 2010) and seizures are characterized by non-linear transitions of an epileptic brain from its less ordered interictal state to a more ordered ictal state. Therefore, many