EFFICIENT IMPLEMENTATION OF ROBUST CUSUM ALGORITHM TO CHARACTERIZE NANOGAPS MEASUREMENTS WITH HEAVY-TAILED NOISE

Javier Kipen, Joakim Jaldén, Shyamprasad N. Raja, Saumey Jain

Division of Information Science and Engineering (ISE), KTH, Stockholm, Sweden
{kipen,jalden}@kth.se
Division of Micro and Nanosystems (MST), KTH, Stockholm, Sweden
{shnr,saumey}@kth.se

ABSTRACT

Detection of bio-molecules through quantum tunneling currents could lead to next-generation DNA sequencing methods. In order to analyze the stability of these sensitive devices, it is necessary to characterize their conductance switching statistics. This characterization can be realized by denoising the tunneling current signal and clustering the outcomes. The first step can be done with the CUSUM algorithm, which detects abrupt changes and has been used in similar devices. We found heavy-tailed non-Gaussian noise in the measurement setup of the experimental devices. This paper suggests an approximation in the likelihood ratio step of the CUSUM algorithm that is more robust than the simple Gaussian noise assumption and, at the same time, is computationally more efficient than computing the fitted true likelihoods.

Index Terms: CUSUM, robust, nanogaps, heavy-tailed

1. INTRODUCTION

Genome sequencing can decode the trove of information stored in DNA and other nucleic acids. This information holds the potential for developing targeted and personalized drugs for therapies. Some recent technologies could significantly reduce the price of this process. A good example is the success of the Oxford Nanopore Technologies DNA sequencers, which rely on modulated ionic currents. It is believed that devices based on quantum mechanical phenomena could further reduce costs and improve accuracy [1].

In this context, we consider the development of a next-generation bio-molecular sensor based on quantum tunnel current measurements.
The sensor consists of a pair of electrodes separated by a nanometer-sized gap enclosed by a microfluidic channel. Applying a voltage bias over the electrodes induces a tunneling current that is modulated by bio-molecules that pass the gap. Such sensors were massively fabricated in parallel in our project by first generating nano-wires with a method similar to the one described in [2] and then generating a gap through electromigration [3] by applying a current with the devices immersed in a chosen medium.

The electromigration step has considerably different outcomes depending on the medium in which it is performed. Some mediums showed a higher yield of tunneling devices. To compare the devices, we analyzed their current stability at fixed voltage biases. When the medium was nitrogen, half of the working devices exhibited random telegraph signal (RTS) noise [4], which can be due to the rearrangement of atoms. We refer to these devices as switching devices. The proportion of devices with RTS noise for other mediums was similar.

The measured signal can be modeled as a piece-wise constant signal with clustered levels immersed in noise. In order to analyze the stability of these devices, the current levels are associated with conductance states. Then, by analyzing the spread, duration, and values of these conductance levels, it can be determined whether these devices are stable enough for bio-molecule measurements.

The CUSUM algorithm [5] was used to denoise the underlying piece-wise constant signal, assuming that the noise was Gaussian. However, the noise of the devices was heavy-tailed. The main contribution of this paper is the development of a robust and computationally efficient version of the CUSUM algorithm for this noise.

2. METHODS

In the denoising step, we tested several approaches from [6], but since the amount of data is large (recordings of minutes for hundreds of devices at 200 kHz), the computation time was excessive, which made the analysis infeasible.

A previous study about the automatic processing of translocations through nanopores [7] used the CUSUM algorithm to denoise the underlying piece-wise constant signal. This algorithm is computationally cheaper than the methods mentioned in [6], which makes it more suitable for the given measurements.

Fig. 1: Distribution fit on real data.

In our measuring setup, we observed heavy-tailed noise. A histogram of the noise values for one specific device is shown in Fig. 1. When the electromigration was done in nitrogen, 70% of the switching devices presented a noise at 0 V that fitted well to a mixture of two zero-mean Gaussians. This proportion was similar for other mediums. The heavy-tailed noise is also present in the measurements of simple resistances, so we concluded that it was due to the instrument used to perform the measurements.

Assuming that the noise distribution is Gaussian for the CUSUM algorithm, as in [7], is sub-optimal in this case. However, implementing the algorithm for the Gaussian mixture is computationally more expensive due to repeated calculations of exponentials and logarithms. We, therefore, propose an approximation to

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) | 978-1-7281-6327-7/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICASSP49357.2023.10096779