HI-Bone: A Scoring System for Identifying Phenylisothiocyanate-Derivatized Peptides Based on Precursor Mass and High Intensity Fragment Ions

Yasset Perez-Riverol,†,‡,# Aniel Sánchez,†,⊥,# Jesus Noda, Diogo Borges, Paulo Costa Carvalho,§ Rui Wang, Juan Antonio Vizcaíno, Lázaro Betancourt, Yassel Ramos, Gabriel Duarte, Fabio C. S. Nogueira, Luis J. González, Gabriel Padrón, David L. Tabb,@ Henning Hermjakob, Gilberto B. Domont,*,⊥ and Vladimir Besada*,†

Department of Proteomics, Center for Genetic Engineering and Biotechnology, Ave 31 e/158 y 190, Cubanacán, Playa, Ciudad de la Habana, Cuba
EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, U.K.
§ Laboratory for Proteomics and Protein Engineering, Carlos Chagas Institute, Fiocruz-Paraná, Brazil
Systems Engineering and Computer Science Program, COPPE, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
Proteomics Unit, Institute of Chemistry, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
@ Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, United States

Supporting Information

ABSTRACT: Peptide sequence matching algorithms used for peptide identification by tandem mass spectrometry (MS/MS) enumerate theoretical peptides from the database, predict their fragment ions, and match them to the experimental MS/MS spectra. Here, we present an approach for scoring MS/MS identifications based on the high mass accuracy matching of precursor ions, the identification of a high-intensity b1 fragment ion, and partial sequence tags from phenylthiocarbamoyl-derivatized peptides. This derivatization process boosts the b1 fragment ion signal, which turns it into a powerful feature for peptide identification.
We demonstrate the effectiveness of our scoring system by implementing it in a computational tool called “HI-bone” and by using it to identify peptides in an Escherichia coli sample acquired on an Orbitrap Velos instrument using higher-energy C-trap dissociation. Following this strategy, we identified 1614 peptide spectrum matches with a peptide false discovery rate (FDR) below 1%. This number was significantly higher than the numbers obtained with Mascot and SEQUEST at a similar FDR.

Protein identification in large-scale shotgun proteomics experiments is usually accomplished by automatically comparing theoretical mass spectra of peptides generated from a protein sequence database with those obtained experimentally, typically by liquid chromatography coupled online with tandem mass spectrometry (LC-MS/MS). Examples of software tools that automatically perform this peptide spectrum matching (PSM) task are search engines such as SEQUEST,1 Mascot,2 X!Tandem,3 and OMSSA.4 In general terms, the specificity of a PSM algorithm is inversely proportional to the size of the peptide search space. As such, these strategies are usually more efficient in experiments addressing model organisms that have a small and well-annotated protein sequence database derived from their genomes (e.g., Escherichia coli).

On the other hand, current PSM algorithms can frequently use only a small number of all the high-quality MS/MS spectra generated in an experiment. The number of peptides generated by the proteolysis of complex samples still overwhelms the analytical capacity of the most advanced LC-MS systems. As a result, only a relatively small proportion of the acquired MS/MS spectra yields positive identifications, due either to poor spectrum quality or to insufficiently optimized scoring methods. Taken together, such aspects might significantly limit the PSM working models. These limitations motivated us to rethink how the experimental design of traditional PSM approaches is accomplished.
Received: November 12, 2012. Accepted: February 28, 2013. Published: February 28, 2013. Technical Note, pubs.acs.org/ac. © 2013 American Chemical Society. dx.doi.org/10.1021/ac303239g | Anal. Chem. 2013, 85, 3515-3520

Here, we propose a methodology to ultimately provide increased sensitivity when analyzing phenylthiocarbamoyl-derivatized peptides (the first step of the Edman degradation reaction). This derivatization process boosts the b1 fragment ion intensity and reduces the number of fragments in the MS/MS spectrum, turning it into a powerful feature that can be