Selected Paper

Using Cross-Correlation with Pattern Recognition Entropy to Obtain Reduced Total Ion Current Chromatograms from Raw Liquid Chromatography-Mass Spectrometry Data

Shiladitya Chatterjee,1 Sean C. Chapman,1 Barry M. Lunt,2 and Matthew R. Linford*1

1 Department of Chemistry and Biochemistry, Brigham Young University, Provo, UT 84602, USA
2 Information Technology, School of Technology, Brigham Young University, Provo, UT 84602, USA

E-mail: mrlinford@chem.byu.edu

Received: August 15, 2018; Accepted: September 20, 2018; Web Released: October 27, 2018

Matthew Linford
Matthew Linford is a professor of chemistry at Brigham Young University. His research lies in the areas of surface and material synthesis and characterization, including the creation of new materials for separation science, and data analysis (chemometrics). He has more than 300 publications. He is an editor for Applied Surface Science, and has served for many years on the editorial board of Surface Science Spectra. For more than three years he has written a ca. monthly article in Vacuum Technology & Coating on surface and material characterization.

Abstract

Total ion current chromatograms (TICCs) generated by liquid chromatography-mass spectrometry (LC-MS) are prone to noise from chemical and electronic sources. This noise can severely impact the detection of analytes in a mixture. Recently, we introduced a new variable selection tool based on Pattern Recognition Entropy (PRE) that selects good quality (high signal-to-noise ratio) mass chromatograms from an LC-MS dataset and thereby creates a reduced TICC with low noise and a flat background (J. Chrom. A. 2018, 1558, 21-28). PRE, which is based on Shannon's entropy, was shown to be a straightforward and powerful shape recognition tool for this problem.
However, while the chromatographic signals in the reduced TICC from PRE were well resolved, some noise remained in the TICC, which suggested that the algorithm had selected some false positives, i.e., poor quality mass chromatograms. In this paper, we report an improved version of the PRE algorithm that utilizes a second variable selection filter based on cross-correlation (CC). As a check on the ability of PRE and CC to select high quality mass chromatograms, every mass chromatogram in our data set (1451 in total) was individually inspected and rated as either high quality (green), intermediate quality (yellow), or poor quality (red). A color-coded plot of the CC value vs. the PRE value for the mass chromatograms was created, which shows that, as expected, the higher quality mass chromatograms are localized in its upper left quadrant, which corresponds to lower PRE values and higher CC values. In our original paper on this topic, we recommended a threshold of 0.5 σ for PRE, which caused the algorithm to select 151 mass chromatograms out of 1451. Of these, 98 were of high quality, 6 were of intermediate quality, and 47 were of poor quality. Using a second threshold for CC, the algorithm retains all the high and intermediate quality mass chromatograms, while removing all 47 of the poor quality ones. The resulting TICC from the PRE-CC algorithm shows less noise than the TICC generated from the PRE approach alone. The PRE-CC algorithm is arguably a faster, simpler, and more intuitive approach than the widely used CODA_DW algorithm.

Keywords: LC-MS | Total Ion Current Chromatogram (TICC) | Noise

Introduction

The total ion current chromatograms (TICCs) obtained in liquid chromatography-mass spectrometry (LC-MS) 1,2 are often limited by high levels of chemical and electronic noise, making the subsequent extraction of real chromatographic information difficult.
3-5 The noise in TICCs arises from the noise present in their constituent mass chromatograms, which can have both high frequency (transients and/or spikes) and low frequency (baseline drift) components. 6 Hardware approaches optimizing the transfer of eluents from the liquid chromatograph to the mass spectrometer have been devised to reduce chemical noise. 7,8 However, limited success has been achieved through these techniques, and LC-MS analysis often relies on post-processing of the TICC to obtain adequate information about analytes. 9 In general, unless noisy mass chromatograms are excluded, poor quality TICCs are obtained.

Document type: Article
Bull. Chem. Soc. Jpn. 2018, 91, 1775-1780 | doi:10.1246/bcsj.20180230 © 2018 The Chemical Society of Japan
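As an illustration of the idea that excluding noisy mass chromatograms yields a cleaner TICC, the following is a minimal Python sketch of an entropy-based selection step. It is not the authors' implementation. It assumes that the entropy of a mass chromatogram is the Shannon entropy of its intensities normalized to unit sum, and that the "0.5 σ" threshold means "keep chromatograms whose entropy lies more than 0.5 standard deviations below the mean entropy"; both are assumptions made here for illustration. The function names and the toy data are hypothetical. The CC filter is not sketched, because this section does not define how the CC value is computed.

```python
import math
import statistics


def shannon_entropy(chromatogram):
    """Shannon entropy (in bits) of a chromatogram normalized to unit sum.

    Assumption: intensities are non-negative. An all-zero chromatogram is
    assigned the maximum possible entropy so that it is never selected.
    """
    total = sum(chromatogram)
    if total <= 0:
        return math.log2(len(chromatogram))
    probs = [x / total for x in chromatogram if x > 0]
    return -sum(p * math.log2(p) for p in probs)


def select_low_entropy(chromatograms, k=0.5):
    """Return indices of chromatograms with entropy below mean - k * sigma.

    Assumption: this cutoff rule is one plausible reading of the
    '0.5 sigma' threshold mentioned in the abstract.
    """
    entropies = [shannon_entropy(c) for c in chromatograms]
    cutoff = statistics.mean(entropies) - k * statistics.pstdev(entropies)
    return [i for i, h in enumerate(entropies) if h < cutoff]


def reduced_ticc(chromatograms, selected):
    """Sum only the selected mass chromatograms at each time point."""
    n_points = len(chromatograms[0])
    return [sum(chromatograms[i][t] for i in selected) for t in range(n_points)]


# Toy data (hypothetical): one peak-shaped chromatogram and two noisy ones.
peak = [0, 0, 1, 8, 1, 0, 0, 0]
flat = [1, 1, 1, 1, 1, 1, 1, 1]
noisy = [2, 1, 2, 1, 2, 1, 2, 1]
data = [peak, flat, noisy]

selected = select_low_entropy(data)
ticc = reduced_ticc(data, selected)
```

In this toy case the peak-shaped chromatogram concentrates its intensity in a few time points and therefore has a much lower entropy than the flat or noisy ones, so it is the only one retained in the reduced TICC.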