IPM: An integrated protein model for false discovery rate
estimation and identification in high-throughput proteomics
Roger Higdon a,b,c,⁎, Lukas Reiter d,e,f,g, Gregory Hather a,b, Winston Haynes a,h,
Natali Kolker b,c, Elizabeth Stewart a,b, Andrew T. Bauman a,b, Paola Picotti g,
Alexander Schmidt g,i, Gerald van Belle k,l, Ruedi Aebersold g,j,m,n, Eugene Kolker a,b,c,o
a Bioinformatics & High-throughput Analysis Laboratory, Seattle, WA, USA
b High-throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA
c Predictive Analytics, Seattle Children's Hospital, Seattle, WA, USA
d Institute of Molecular Biology, University of Zurich, Zurich, Switzerland
e Center for Model Organism Proteomes, University of Zurich, Zurich, Switzerland
f Ph.D. Program in Molecular Life Sciences Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
g Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
h Hendrix College, Conway, AR, USA
i Biozentrum, University of Basel, Basel, Switzerland
j Competence Center for Systems Physiology and Metabolic Diseases, Zurich, Switzerland
k Department of Biostatistics, University of Washington, Seattle, WA, USA
l Department of Environmental and Occupational Health Sciences, University of Washington, Seattle, WA, USA
m Faculty of Science, University of Zurich, Zurich, Switzerland
n Institute for Systems Biology, Seattle, WA, USA
o Medical Education and Biomedical Informatics, University of Washington, Seattle, WA, USA
ARTICLE INFO
Available online 21 June 2011
ABSTRACT
In high-throughput mass spectrometry proteomics, peptides and proteins are not simply
identified as present or not present in a sample; rather, the identifications are associated with
differing levels of confidence. The false discovery rate (FDR) has emerged as an accepted means
for measuring the confidence associated with identifications. We have developed the
Systematic Protein Investigative Research Environment (SPIRE) for the purpose of integrating
the best available proteomics methods. Two successful approaches to estimating the FDR for
MS protein identifications are the MAYU and our current SPIRE methods. We present here a
method that combines these two approaches into an integrated protein model (IPM). We
illustrate the high-quality performance of this IPM approach by testing it on two large,
publicly available proteomics datasets. MAYU and SPIRE
show remarkable consistency in identifying proteins in these datasets. Still, IPM yields
more robust FDR estimation and additional identifications, particularly among
low-abundance proteins. IPM is now implemented as part of the SPIRE system.
© 2011 Published by Elsevier B.V.
Keywords:
Protein identification
False discovery rate
Mass spectrometry
Decoy database
JOURNAL OF PROTEOMICS 75 (2011) 116 – 121
Abbreviations: FDR, false discovery rate; FP, false positive; ID, identification; IPM, integrated protein model; LIPS, logistic identification of
peptide sequences; MS, mass spectrometry; PSM, peptide spectral match; SPIRE, Systematic Protein Investigative Research Environment;
TP, true positive
⁎ Corresponding author at: SCRI, 1900 Ninth Ave, Seattle, WA 98101, USA. Tel.: +1 206 884 7172; fax: +1 206 987 7660.
E-mail address: Roger.Higdon@seattlechildrens.org (R. Higdon).
1874-3919/$ – see front matter © 2011 Published by Elsevier B.V.
doi:10.1016/j.jprot.2011.06.003