High Frequency Noise Detection and Handling in ECG Signals

Kjell Le, Trygve Eftestøl, and Kjersti Engan
Department of Electrical Engineering and Computer Science, University of Stavanger
Email: kjell.le@uis.no

Stein Ørn*†
*Department of Cardiology, Stavanger University Hospital
†Department of Electrical Engineering and Computer Science, University of Stavanger

Øyunn Kleiven
Department of Cardiology, Stavanger University Hospital

Abstract: After acquisition of new clinical electrocardiogram (ECG) signals, the first step is often to preprocess them and perform a signal quality assessment to uncover noise. Restrictions on the signal length and other issues may make it impossible to discard a whole signal when noise is present. There is thus a great need to retain as many noise-free regions as possible. A noise detection method is evaluated on a manually annotated subset (2146 leads) of a database of 12-lead ECG recordings from 1006 bicycle race participants. The aim is to apply the noise detector to the unlabelled part of the data set before any further analysis is conducted. The proposed noise detector can be divided into three parts: 1) select a high frequency signal as a base signal; 2) apply a thresholding strategy to the base signal; 3) use a noise detection strategy. In this work, the receiver operating characteristic (ROC) curve and the area under the curve (AUC) are used to assess a high frequency noise detector designed for ECG signals. Although ROC analysis is widely used to assess prediction models, it has its own limitations; it is nevertheless a good starting point for assessing discriminatory ability. To generate the ROC curve, performance is evaluated at sample level, that is, each sample carries a label stating whether it is noise or not. The threshold strategy and the chosen threshold are the varying factors used to generate the ROC curves. The best model has an average AUC of 0.862, which indicates that the detector discriminates noise well.
This threshold strategy will be used for noise detection on the unlabelled part of the data set.

I. INTRODUCTION

A challenge in handling clinical electrocardiogram (ECG) signals is to identify the noise regions that are present in parts of the signals. This is especially challenging for data sets of several thousand ECGs, as is the case in this work. Denoising, i.e. removing the noise that can be removed, and identifying regions with noise that cannot be removed must both be done before any clinical interpretation can be drawn from the data material. Otherwise the noise may interfere with the veracity of the interpretation.

The required resolution of the noise detector depends on the stage, from acquisition to assessment of the ECG signal, at which the noise identification is done. During acquisition, the signal quality assessment can use relatively large segments, for example 10 s, to evaluate the quality of the signal [1]–[4], i.e. the detection is done at segment level. At this stage there is the luxury of being able to repeat the measurement if the signal quality is unacceptable. Often, however, another assessment is done after the entire data material has been collected, when it is no longer possible to redo the ECG recording of a specific person. In this case the noise detection should be much more localized, so as not to discard signals that can be useful in clinical interpretation.

A possible approach is to develop and evaluate a noise detector on a manually annotated subset (2146 segments) of the complete data set. The best performing noise detector can then be applied to the unlabelled part of the data set before any further analysis is conducted. At this point it is desirable to remove noise to enhance the signal quality. However, since the frequency bands of important information and noise always overlap, some information is expected to be lost.
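As a rough illustration of what evaluating a detector at sample level on an annotated signal can look like, the following minimal sketch sweeps a threshold over a per-sample score and computes ROC points and a trapezoidal AUC. Everything in it is an assumption made for illustration and not the method evaluated in this work: the synthetic signal, the sampling rate `fs`, the first difference used as a crude high frequency base signal, the moving-average envelope, and the helper names `roc_points` and `auc_trapezoid`.

```python
import numpy as np


def roc_points(score, labels, thresholds):
    """Sample-level ROC points: a sample is flagged as noise when score > threshold."""
    score = np.asarray(score, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_pos = labels.sum()
    n_neg = (~labels).sum()
    fpr = np.empty(len(thresholds))
    tpr = np.empty(len(thresholds))
    for i, thr in enumerate(thresholds):
        flagged = score > thr
        tpr[i] = (flagged & labels).sum() / n_pos
        fpr[i] = (flagged & ~labels).sum() / n_neg
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Trapezoidal area under ROC points that are ordered by increasing FPR."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


# Synthetic stand-in for one annotated lead (assumption, not real ECG data).
rng = np.random.default_rng(0)
fs = 500                      # assumed sampling rate in Hz
n = 3000
t = np.arange(n) / fs
signal = np.sin(2 * np.pi * 1.0 * t)          # slow "clean" component
labels = np.zeros(n, dtype=bool)
labels[1000:1500] = True                      # annotated noise region
signal[labels] += rng.normal(0.0, 0.5, labels.sum())

# Crude high frequency base signal: first difference (assumed stand-in
# for a highpass filter, SWT detail band or EMD mode).
base = np.diff(signal, prepend=signal[0])

# Per-sample score: moving average of the absolute base signal.
window = 25
score = np.convolve(np.abs(base), np.ones(window) / window, mode="same")

# Descending thresholds, so the ROC runs from (0, 0) to (1, 1).
thresholds = np.append(np.linspace(score.max(), 0.0, 200), -1.0)
fpr, tpr = roc_points(score, labels, thresholds)
auc = auc_trapezoid(fpr, tpr)
print(f"sample-level AUC on the synthetic example: {auc:.3f}")
```

Because every sample carries its own label, the true and false positive rates are counted over samples and not over 10 s segments, which is what permits localized detection. The synthetic burst is far easier to separate than real recordings, so the AUC printed here says nothing about performance on clinical data.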
Recommendations on preprocessing methods to remove baseline wander, powerline interference and high frequency noise can be found in [5].

One method to identify severe noise in the high frequency range is to extract a high frequency signal from the original signal and identify abnormalities in the extracted signal. An abnormality could be the presence of large energy in the high frequency spectrum for one section of the signal compared to other sections. Methods to extract the high frequency signal include, but are not limited to, a highpass filter, the stationary wavelet transform (SWT) [6], also known as the algorithme à trous, and empirical mode decomposition (EMD) [2], [7]. The differences between the resulting high frequency signals are relatively small when juxtaposed, as shown in figure 2.

In the literature, noise detection in ECG is usually done at segment level, where segments typically have a duration of 10 s [1]–[4]. For the purpose of this work, noise localization is defined at sample level, where each sample is labelled as noise or not noise, permitting better exploitation of the data set.

The proposed noise detection is a three-part dissection of a method proposed by Satija et al. [8]. That is: 1) Extract a high frequency signal, the base signal, to use as input. 2) A threshold

2018 26th European Signal Processing Conference (EUSIPCO) ISBN 978-90-827970-1-5 © EURASIP 2018 46