A new approach to automated peak detection

Kristin H. Jarman*, Don S. Daly, Kevin K. Anderson, Karen L. Wahl

Pacific Northwest National Laboratory, P.O. Box 999/MS K5-12, Richland, WA 99352, USA

Received 26 October 2001; received in revised form 6 June 2003; accepted 13 June 2003

Abstract

Spectral peak detection algorithms are often difficult to automate because they either rely on somewhat arbitrary rules, or are tuned to specific spectral peak properties. One popular approach detects peaks where signal intensities exceed some threshold. This threshold is typically set arbitrarily above the noise level or manually by the user. Intensity threshold-based methods can be sensitive to baseline variations and signal intensity. Another popular peak detection approach relies on matching the spectral intensities to a reference peak shape. This approach can be very sensitive to baseline changes and deviations from the reference peak shape. Such methods can be significantly challenged by modern analytical instrumentation where the baseline tends to drift, peaks of interest may have a low signal-to-noise (S/N) ratio, and no well-defined reference peak shape is available. We present a new approach for spectral peak detection that is designed to be generic and easily automated. Employing a histogram-based model for spectral intensity, peaks are detected by comparing the estimated variance of observations (the x-axis of the spectrum) to the expected variance when no peak is present inside some window of interest. We compare an implementation of this approach to two existing peak detection algorithms using a series of simulated spectra.

© 2003 Elsevier B.V. All rights reserved.

Keywords: Peak detection; Peak identification; Matrix-assisted laser desorption/ionization mass spectrometry; MALDI mass spectrometry

1. Introduction

Effective peak detection and characterization remains an important issue in analytical instrumentation.
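As a rough illustration of the variance-comparison idea summarized in the abstract, the sketch below treats the intensities inside a window as histogram counts over the x-axis and compares their weighted variance with the variance expected when every position carries equal weight (no peak). This is not the authors' implementation: the function names, the window half-width, and the fixed cutoff of 0.8 are hypothetical choices made only for illustration, and the fixed cutoff stands in for whatever statistical test is actually used.

```python
import math

def variance_ratio(x, y):
    """Ratio of the intensity-weighted variance of x to the variance of x
    under a flat (no-peak) intensity profile over the same window."""
    total = sum(y)
    if total <= 0:
        raise ValueError("window must contain positive total intensity")
    # Treat intensities as histogram counts at each x position.
    mean_w = sum(xi * yi for xi, yi in zip(x, y)) / total
    var_w = sum(yi * (xi - mean_w) ** 2 for xi, yi in zip(x, y)) / total
    # Expected variance when every position carries equal weight.
    mean_u = sum(x) / len(x)
    var_u = sum((xi - mean_u) ** 2 for xi in x) / len(x)
    return var_w / var_u

def detect_peaks(x, y, half_width=10, cutoff=0.8):
    """Return indices whose centered window has a variance ratio below cutoff.
    half_width and cutoff are illustrative assumptions, not published values."""
    hits = []
    for i in range(half_width, len(x) - half_width):
        lo, hi = i - half_width, i + half_width + 1
        if variance_ratio(x[lo:hi], y[lo:hi]) < cutoff:
            hits.append(i)
    return hits

# Synthetic, noise-free spectrum: flat baseline plus one Gaussian peak at x = 50.
x = [float(i) for i in range(101)]
flat = [1.0] * len(x)
peaked = [1.0 + 20.0 * math.exp(-0.5 * ((xi - 50.0) / 2.0) ** 2) for xi in x]
flat_hits = detect_peaks(x, flat)
peak_hits = detect_peaks(x, peaked)
```

On the flat spectrum the ratio equals 1 in every window, so nothing is flagged; a peak near the window center concentrates the weight and pulls the ratio well below 1. A constant baseline offset scales all weights equally and leaves the flat-window ratio unchanged, which hints at why a variance-based statistic may be less sensitive to baseline level than a raw intensity threshold.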
Spectral or chromatographic peaks provide the user with much of the sample composition information. As a result, analysts rely heavily on the output of peak detection and characterization algorithms for scientific interpretation and understanding. Many peak detection algorithms exist; however, surprisingly little has been published in the open literature that fully describes existing algorithms and provides estimates of their performance.

* Corresponding author. Tel.: +1-509-375-4539; fax: +1-509-375-2604. E-mail address: Kristin.jarman@pnl.gov (K.H. Jarman).
0169-7439/$ - see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/S0169-7439(03)00113-8
www.elsevier.com/locate/chemolab
Chemometrics and Intelligent Laboratory Systems 69 (2003) 61–76

To detect or identify the presence of peaks, traditional algorithms generally focus solely on signal intensities. Used since the early 1970s, the original approach involves two steps: (1) characterizing the instrument noise level (such as the standard deviation of intensity values in the absence of peaks) and (2) identifying sequences of intensity values that exceed a critical threshold constructed from the baseline noise level (such as 10 standard deviations). This type of approach can be viewed as a one-sided