Proceedings of World Congress on Medical Physics and Biomedical Engineering, Chicago, July 23-28, 2000.

Using Wavelet Coefficients for the Classification of the Electrocardiogram

P. de Chazal¹, B. G. Celler¹ and R. B. Reilly²

Abstract - This study investigates the automatic classification of the Frank lead electrocardiogram (ECG) into different pathophysiological disease categories. Coefficients from the discrete wavelet transform are used to represent the ECG diagnostic information, and the performance of classifiers processing feature sets generated using different mother wavelets is compared. Fifteen feature sets are calculated from three Daubechies wavelets, with the decomposition level varied between 3 and 7. The classification performance of each feature set was optimised using automatic feature selection and by combining classifications of multi-beat ECG information. Throughout the study a database of 500 ECG records with examples from seven disease categories was used. The classification of each record is known with 100% confidence and is based on ECG-independent information. Using multiple runs of 10-fold cross-validation to obtain all results, it was shown that the overall classification performance of the different feature sets ranged from 71.6% to 74.2%. In addition, the wavelet order and level had little influence on the overall performance. Analysis of the automatically chosen features reveals that time-frequency bands in the vicinity of the QRS onset and the T-wave are consistently selected.

Key words – ECG classification, Wavelets, Cross-validation

1 INTRODUCTION

The classification of the electrocardiogram (ECG) into different pathophysiological disease categories is a complex pattern recognition task. Computer-based classification of the ECG can achieve high accuracy and offers the potential of affordable mass screening for cardiac abnormalities.
Successful classification is achieved by finding characteristic shapes of the ECG that discriminate effectively between the required diagnostic categories. Conventionally, a typical heart beat is identified from the ECG and its QRS, T and possibly P waves are characterised using measurements such as magnitude, duration and area. Classification is then achieved on the basis of these measurements. Measurements based on the QRS, T and P sections vary significantly even among normal subjects and can lead to misclassification.

Wavelet analysis consists of breaking up a signal into shifted and scaled versions of a reference (mother) wavelet. A wavelet is a signal of effectively limited duration that has an average value of zero. In determining the wavelet (decomposition) coefficients of a signal, the correlation of the mother wavelet at different shifts and scales with the signal is computed. Hence, the wavelet coefficients represent measures of similarity of the local shape of the signal to the mother wavelet under different shifts and scales. We utilise this property in this study by using the wavelet coefficients to describe the ECG shape. Classification is performed directly on the wavelet coefficients.

There is no intuitive way to know which mother wavelet to choose or what level of decomposition to use. In this study we consider a range of Daubechies wavelets and decomposition levels. A database of modest size was employed; hence a cross-validation scheme was used to estimate the performance of the different feature sets.

2 METHODS

In this study the Frank lead ECG [1] has been used. The Frank lead ECG record presents three ECG signals that project the electrical field generated by the muscular tissue of the heart onto the mutually orthogonal sagittal, frontal and transverse planes. The Frank system attempts to compensate for distortions of the electric field introduced by the irregular shape of the human torso.
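As a rough illustration of the wavelet coefficients described in the Introduction, the sketch below computes a multi-level discrete wavelet transform and concatenates the coefficients into a flat feature vector. It is a minimal sketch under stated assumptions, not the implementation used in this study: the 4-tap Daubechies filter, the periodic signal extension, the requirement that the signal length be divisible by 2**levels, and the names `dwt_step` and `wavelet_features` are all chosen here for illustration only.

```python
import math

# 4-tap Daubechies scaling filter. Choosing this particular wavelet is an
# assumption; the study only states that three Daubechies wavelets were used.
SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
LOWPASS = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM,
           (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
# Quadrature mirror filter: reverse the low-pass taps and alternate signs.
HIGHPASS = [(-1) ** k * LOWPASS[len(LOWPASS) - 1 - k]
            for k in range(len(LOWPASS))]


def dwt_step(signal):
    """One analysis level with periodic extension (an assumption).

    Returns (approximation, detail), each half the input length.
    """
    n = len(signal)
    approx, detail = [], []
    for start in range(0, n, 2):
        # Correlate each filter with the signal at this shift.
        approx.append(sum(LOWPASS[k] * signal[(start + k) % n]
                          for k in range(len(LOWPASS))))
        detail.append(sum(HIGHPASS[k] * signal[(start + k) % n]
                          for k in range(len(HIGHPASS))))
    return approx, detail


def wavelet_features(signal, levels):
    """Return [cA_L, cD_L, ..., cD_1] as one flat list (hypothetical layout)."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        details.append(detail)
    features = list(approx)
    for detail in reversed(details):  # coarsest detail band first
        features.extend(detail)
    return features


if __name__ == "__main__":
    # Synthetic stand-in for one windowed beat, not real ECG data.
    beat = [math.exp(-((i - 32) / 4.0) ** 2) for i in range(64)]
    feats = wavelet_features(beat, levels=3)
    print(len(feats))
```

Each detail coefficient measures how closely the signal resembles the wavelet at one shift and scale, so a feature selection stage can then keep only the time-frequency bands that discriminate between classes.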
In practice the Frank lead signals are only approximately orthogonal views, owing to the wide variation in shape and tissue content of human torsos. Figure 1 shows the data processing steps used in this study.

2.1 ECG PRE-PROCESSING

The ECG is sampled at 500 Hz and then filtered with a 0.5-40 Hz linear-phase digital bandpass filter to remove unwanted baseline drift and powerline interference. All QRS complexes are detected, and a data window containing the P-QRS-T complex is isolated for each beat using the ECG samples from 200 ms before the R-wave maximum to 400 ms after it. The isopotential value is subtracted, and the data window is multiplied by a Hanning