Expert Systems With Applications 114 (2018) 54–64

Journal homepage: www.elsevier.com/locate/eswa

ECG classification using three-level fusion of different feature descriptors

Zahra Golrizkhatami, Adnan Acan

Computer Engineering Department, Faculty of Engineering, Eastern Mediterranean University, Famagusta, via Mersin 10, Turkey

Article info

Article history: Received 16 April 2018; Revised 26 June 2018; Accepted 12 July 2018

Keywords: Electrocardiogram; Morphological feature; Statistical feature; Temporal features; Multi-stage CNN-based features; Convolutional neural networks; Feature-level fusion; Score-level fusion; Decision-level fusion

Abstract

Fusing feature descriptors extracted from a signal by different methods is important for exploiting the representational power of each descriptor. In this work, a novel system is proposed that exploits multi-stage features from a trained convolutional neural network (CNN) and combines them with a selection of handcrafted features. The set of handcrafted features consists of three subsets: wavelet-transform-based morphological features representing localized signal behaviour, statistical features exhibiting the overall variational characteristics of the signal, and temporal features representing the signal’s behaviour on the time axis. The proposed system utilizes a novel decision-level fusion of features for ECG classification by three different approaches: the first uses normalized feature-level fusion of handcrafted global statistical and local temporal features by uniting these features into one set, the second uses the morphological feature subset, and the third combines features extracted from multiple layers of a CNN using a score-level based refinement procedure.
The main impact of the proposed approach is the score-level based fusion of automatically learned features extracted from multiple layers of a trained CNN and the decision-level fusion of features characterising the signal in totally different representational spaces. The individual decisions of the three different classifiers are fused by majority voting, and a unified decision is reached for the classification of the input ECG signal. Results on the MIT-BIH arrhythmia benchmark database show that the proposed system achieves superior classification accuracy compared to all of the state-of-the-art ECG classification methods.

© 2018 Elsevier Ltd. All rights reserved.

Corresponding author. E-mail addresses: Zahra.golrizkhatami@emu.edu.tr (Z. Golrizkhatami), Adnan.acan@emu.edu.tr (A. Acan).

1. Introduction

The heart’s electrical activity is captured using electrodes attached to specific points on the patient’s chest. Electrocardiogram (ECG) classification is one of the most challenging tasks in heartbeat analysis. Medical centres use ECGs to detect various cardiovascular diseases. By monitoring a patient’s ECG tape, expert cardiologists are able to recognize various types of cardiac arrhythmia, which can be the cause of several serious heart diseases. In the last decade, researchers have proposed different pattern recognition systems to detect such arrhythmias automatically, which have been very helpful for cardiologists and clinicians in hospitals. Although collecting ECG data is easy, challenges in extracting the most useful information from ECG signals still exist. Also, due to the limited accuracy of visual and manual interpretation of ECGs, researchers have proposed the use of computer-aided diagnosis (CAD) systems for the automatic analysis and interpretation of these signals.
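To make the decision-level fusion step summarised in the abstract concrete, the following is a minimal sketch in Python of majority voting over the class labels produced by several classifiers for one heartbeat. It is an illustration only, not the implementation used in this work. The function name `majority_vote`, the placeholder labels "N", "V" and "S", and the tie-breaking rule (when no label has a strict plurality, the tied label predicted by the earliest-listed classifier is returned) are assumptions introduced for the sketch.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(decisions: Sequence[Hashable]) -> Hashable:
    """Fuse per-classifier labels for one sample by majority voting.

    `decisions` holds one predicted label per classifier, listed in
    priority order. The priority-based tie-break is an assumption of
    this sketch, not a rule taken from the paper.
    """
    if not decisions:
        raise ValueError("at least one decision is required")

    counts = Counter(decisions)
    top_votes = max(counts.values())
    # Labels that share the highest vote count (more than one means a tie).
    tied = {label for label, votes in counts.items() if votes == top_votes}

    # Return the first tied label in classifier priority order; with a
    # unique winner this is simply the majority label.
    for label in decisions:
        if label in tied:
            return label
    raise AssertionError("unreachable: tied labels come from decisions")


# Hypothetical decisions from three classifiers for a single beat.
fused = majority_vote(["N", "V", "V"])
```

With three classifiers, two agreeing decisions always determine the fused label; the tie-break only matters when all three disagree.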
In the literature, the success of score-level fusion in improving classification accuracy has already been demonstrated experimentally in multimodal biometric recognition systems (He et al., 2010; Taheri & Toygar, 2018). In these implementations, the similarity between the test and training feature vectors is referred to as the match score, and it is shown that this score contains more information about the test sample than its raw data or its own feature vectors. Score-level fusion is basically a distance-based classifier that determines the class label of a test sample by computing the distances between its feature vector and those of the training samples. On the other hand, feature-level fusion can be implemented by concatenating different normalized feature vectors. The curse of dimensionality is the main drawback of feature-level fusion. Compared to feature-level fusion, score-level fusion is easier to implement and computationally more efficient when dealing with large sets of feature vectors.

https://doi.org/10.1016/j.eswa.2018.07.030 0957-4174/© 2018 Elsevier Ltd. All rights reserved.

Deep learning, and especially the CNN, is one of the best choices in many well-known artificial intelligence (AI) applications such as speech recognition, signal and image processing and natural lan-