IPM: An integrated protein model for false discovery rate estimation and identification in high-throughput proteomics

Roger Higdon (a,b,c), Lukas Reiter (d,e,f,g), Gregory Hather (a,b), Winston Haynes (a,h), Natali Kolker (b,c), Elizabeth Stewart (a,b), Andrew T. Bauman (a,b), Paola Picotti (g), Alexander Schmidt (g,i), Gerald van Belle (k,l), Ruedi Aebersold (g,j,m,n), Eugene Kolker (a,b,c,o)

a Bioinformatics & High-throughput Analysis Laboratory, Seattle, WA, USA
b High-throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA
c Predictive Analytics, Seattle Children's Hospital, Seattle, WA, USA
d Institute of Molecular Biology, University of Zurich, Zurich, Switzerland
e Center for Model Organism Proteomes, University of Zurich, Zurich, Switzerland
f Ph.D. Program in Molecular Life Sciences Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
g Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
h Hendrix College, Conway, AR, USA
i Biozentrum, University of Basel, Basel, Switzerland
j Competence Center for Systems Physiology and Metabolic Diseases, Zurich, Switzerland
k Department of Biostatistics, University of Washington, Seattle, WA, USA
l Department of Environmental and Occupational Health Sciences, University of Washington, Seattle, WA, USA
m Faculty of Science, University of Zurich, Zurich, Switzerland
n Institute for Systems Biology, Seattle, WA, USA
o Medical Education and Biomedical Informatics, University of Washington, Seattle, WA, USA

ARTICLE INFO
Available online 21 June 2011

ABSTRACT
In high-throughput mass spectrometry proteomics, peptides and proteins are not simply identified as present or absent in a sample; rather, the identifications are associated with differing levels of confidence. The false discovery rate (FDR) has emerged as an accepted means of measuring the confidence associated with identifications.
We have developed the Systematic Protein Investigative Research Environment (SPIRE) for the purpose of integrating the best available proteomics methods. Two successful approaches to estimating the FDR for MS protein identifications are the MAYU and our current SPIRE methods. We present here a method that combines these two approaches into an integrated protein model (IPM). We illustrate the high-quality performance of the IPM approach through testing on two large publicly available proteomics datasets. MAYU and SPIRE show remarkable consistency in identifying proteins in these datasets. Still, IPM yields more robust FDR estimation and additional identifications, particularly among low-abundance proteins. IPM is now implemented as part of the SPIRE system.
© 2011 Published by Elsevier B.V.

Keywords: Protein identification; False discovery rate; Mass spectrometry; Decoy database

Journal of Proteomics 75 (2011) 116-121

Abbreviations: FDR, false discovery rate; FP, false positive; ID, identification; IPM, integrated protein model; LIPS, logistic identification of peptide sequences; MS, mass spectrometry; PSM, peptide spectral match; SPIRE, Systematic Protein Investigative Research Environment; TP, true positive.

Corresponding author at: SCRI, 1900 Ninth Ave, Seattle, WA 98101, USA. Tel.: +1 206 884 7172; fax: +1 206 987 7660.
E-mail address: Roger.Higdon@seattlechildrens.org (R. Higdon).

1874-3919/$ - see front matter
doi:10.1016/j.jprot.2011.06.003
Available at www.sciencedirect.com
www.elsevier.com/locate/jprot