Knowledge-Based Systems 106 (2016) 38–50

Contents lists available at ScienceDirect

Knowledge-Based Systems

journal homepage: www.elsevier.com/locate/knosys

Automatic signal abnormality detection using time-frequency features and machine learning: A newborn EEG seizure case study

Boualem Boashash ∗, Samir Ouelha

Qatar University, Department of Electrical Engineering, Doha, Qatar

Article info

Article history: Received 26 January 2016; Revised 5 May 2016; Accepted 13 May 2016; Available online 17 May 2016

Keywords: Newborn EEG seizure; Time-frequency analysis; Feature extraction; Feature selection; High-resolution TFD; Clinical decision making

Abstract

Time-frequency (TF) based machine learning methodologies can improve the design of classification systems for non-stationary signals. Using selected TF distributions (TFDs), TF feature extraction is performed on multi-channel recordings using channel fusion and feature fusion approaches. Following the findings of previous studies, a TF feature set is defined to include three complementary categories: signal related features, statistical features and image features. Multi-class strategies are then used to improve the classification algorithm robustness to artifacts. The optimal subset of TF features is selected using the wrapper method with sequential forward feature selection (SFFS). In addition, a new proposed measure for TF feature selection is shown to improve the sensitivity of the classifier (while slightly reducing total accuracy and specificity). As an illustration, the TF approach is applied to the design of a system for detection of seizure activity in real newborn EEG signals. Experimental results indicate that: (1) The compact kernel distribution (CKD) outperforms other TFDs in classification accuracy; (2) a feature fusion strategy gives better classification than a channel fusion strategy; e.g.
using all TF features, the CKD achieves a classification accuracy of 82% with feature fusion, which is 4% more than the channel fusion approach; (3) the SFFS wrapper feature selection method improves the classification performance for all TFDs; e.g. total accuracy is improved by 4.6%; (4) the multi-class strategy improves the seizure detection accuracy in the presence of artifacts; e.g. a total accuracy of 86.61% is achieved with the one vs. one multi-class approach, i.e. 0.91% more than the binary classification approach. The results obtained on a large practical real data set confirm the improved performance capability of TF features for knowledge based systems.

© 2016 Elsevier B.V. All rights reserved.

∗ Corresponding author. Fax: +974 44034201.
E-mail addresses: boualem.boashash@gmail.com, boualem@qu.edu.qa (B. Boashash), samir_ouelha@hotmail.fr (S. Ouelha).

1. Introduction

This study is intended to be applicable to all types of non-stationary signals regardless of their nature or origin, but without loss of generality we consider EEG signals for illustration purposes. The EEG is a well-known non-invasive test used in a wide range of applications such as epilepsy studies. It uses several electrodes placed on a patient's scalp to record electrical activity from the brain. These EEG signals, like most real signals, have been shown to possess non-stationary characteristics [1]. However, the two classical signal representations, i.e. the time-domain representation and the frequency-domain representation, both treat the signal as stationary, which is a rough simplification. These conventional representations (in time or frequency) have been shown to be inadequate for non-stationary signals; instead, joint time-frequency (t, f) domain representations were found to be better adapted to process such signals.
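
As a rough illustration of why a frequency-only view is inadequate for non-stationary signals, the following minimal Python sketch (not taken from the paper) builds a toy two-tone signal and its time-reversed copy. The signal itself, the segment length `HALF`, and the helper names `dft_magnitudes`, `dominant_bin` and `dominant_bins_per_half` are all assumptions made for illustration; the two-window analysis is only a crude stand-in for a (t, f) representation and is not one of the high-resolution TFDs considered in this study.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of the discrete Fourier transform (naive O(N^2) version)."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def dominant_bin(samples):
    """Index of the strongest positive-frequency bin, DC excluded."""
    mags = dft_magnitudes(samples)
    half = len(samples) // 2
    return max(range(1, half), key=lambda k: mags[k])


def dominant_bins_per_half(samples):
    """Crude two-window time-frequency view: dominant bin in each half."""
    mid = len(samples) // 2
    return dominant_bin(samples[:mid]), dominant_bin(samples[mid:])


HALF = 64  # samples per segment (illustrative value, not from the paper)

# Toy non-stationary signal: 4 cycles per segment first, then 12 cycles per segment.
low_then_high = (
    [math.sin(2 * math.pi * 4 * t / HALF) for t in range(HALF)]
    + [math.sin(2 * math.pi * 12 * t / HALF) for t in range(HALF)]
)
# Time-reversed copy: same frequency content, opposite order in time.
high_then_low = low_then_high[::-1]

# Largest difference between the two full-record magnitude spectra.
spectrum_gap = max(
    abs(a - b)
    for a, b in zip(dft_magnitudes(low_then_high), dft_magnitudes(high_then_low))
)

# Dominant frequency bin in each half of each signal.
order_a = dominant_bins_per_half(low_then_high)
order_b = dominant_bins_per_half(high_then_low)

print(f"max spectrum difference: {spectrum_gap:.2e}")
print("dominant bins per half:", order_a, order_b)
```

Under these assumptions, the full-record magnitude spectra of the two signals agree to within floating-point rounding, while the per-half analysis reports bins (4, 12) for the first signal and (12, 4) for its reversed copy. Only the windowed view reveals the order in which the two tones occur.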
In particular, there are features that represent subtle changes which may not be visible in the time domain or the frequency domain, but are clearly visible in the joint time-frequency domain (see Appendix A for two illustrative examples). Recent studies have also found that time-frequency (TF) signal classification using such (t, f) domain features can outperform conventional time-only or frequency-only signal classification approaches, as they allow more discriminative information to be extracted from the signal [1]. Fig. 1 illustrates the TF feature extraction methodologies and approaches that form the basis of this study.

http://dx.doi.org/10.1016/j.knosys.2016.05.027
0950-7051/© 2016 Elsevier B.V. All rights reserved.

There are two basic TF approaches to signal classification [1,2].

(1) Visual analysis for manual classification [3]: for this approach to be effective, it is important to select a TFD that offers high resolution to avoid blurring or mixing up unrelated components [1].

(2) Automated classification using a template matching or machine learning approach: to detect abnormal changes in a signal as soon as they occur, without human intervention, an automated implementation is necessary. For a TF approach, one can use: (a)