Automated interpretation of low-energy collision-induced dissociation spectra by SeqMS, a software aid for de novo sequencing by tandem mass spectrometry

Jorge Fernandez-de-Cossio 1,3, Javier Gonzalez 3, Yoshinori Satomi 1, Takaki Shima 2, Nobuaki Okumura 2, Vladimir Besada 3, Lazaro Betancourt 3, Gabriel Padron 3, Yasutsugu Shimonishi 1, Toshifumi Takao 1

1 Division of Organic Chemistry, 2 Division of Protein Metabolism, Institute for Protein Research, Osaka University, Suita, Osaka, Japan
3 Center for Genetic Engineering and Biotechnology, Havana, Cuba

SeqMS, a software aid for de novo sequencing by tandem mass spectrometry (MS/MS), which was initially developed for the automated interpretation of high-energy collision-induced dissociation (CID) MS/MS spectra of peptides, has been applied to the interpretation of low-energy CID and post-source decay (PSD) spectra of peptides. Based on peptide backbone fragmented ions and their related ions, which are the dominant ions observed in the latter two techniques, the types of ions and their propensities to be observed have been optimized for efficient interpretation of the spectra. In a typical example, the modified SeqMS allowed the complete sequencing of a 31-amino acid synthetic peptide, except for the isobaric amino acids (Leu or Ile, and Lys or Gln), based on only the low-energy CID-MS/MS spectrum.

Keywords: Software program / Low-energy collision-induced dissociation / Peptide sequencing / Tandem mass spectrometry

EL 3951

1 Introduction

Mass spectrometry (MS) is widely accepted as a powerful method for the rapid and high-throughput analysis of limited amounts of proteins. Genome projects have served as a stimulus for proteome projects which, in turn, have been greatly aided by MS and related techniques [1]. The latter involve the use of software for MS data processing and interpretation, in conjunction with database searching.
The software programs which aid in performing "peptide mass fingerprinting" [2], based on MS data, represent tools which are in common use, in conjunction with a sequence database. In situations where several candidate sequences are obtained as a result of database searching based on peptide masses, a software program which aids in database searching based on MS/MS fragment ions [3–5], or the measurement of exact masses [6], will be required in order to obtain final identification of the target protein. In case of unknown peptides or proteins, de novo sequencing, based on tandem mass spectrometry (MS/MS) data, is required, but only a limited number of programs that aid in sequencing by MS/MS are currently available [6–11].

We earlier reported a software program, "SeqMS", designed to aid in de novo sequencing, which is based on high-energy CID-MS/MS spectra [7]. In this study, SeqMS has been improved so as to permit the interpretation of other types of MS/MS data, such as those obtained via low-energy CID-MS/MS and PSD methods. The utility and efficacy of the software program is demonstrated by testing forty-two low-energy CID and PSD spectra of synthetic and proteolytic peptides. The spectra were obtained using a hybrid quadrupole orthogonal acceleration tandem mass spectrometer, equipped with a nanoelectrospray source, or a matrix-assisted laser desorption/ionization (MALDI) time-of-flight (TOF) mass spectrometer, respectively.

2 Materials and methods

2.1 Chemicals and materials

Cytochrome c, β-lactoglobulin, horse myoglobin, peptide # 34, and α-cyano-4-hydroxycinnamic acid (α-CHCA) were purchased from Sigma (St. Louis, MO, USA). The peptides (# 30, 31, 35, 36, 42) were purchased from the Peptide Institute (Osaka, Japan). The synthetic peptides (# 25–29, 32, 33) were prepared with a 9050 peptide synthesizer (PerSeptive Biosystems, Framingham, MA, USA).
The peptides were cleaved from the resins and deprotected by treatment with a cocktail composed of trifluoroacetic acid, thioanisole and m-cresol (90:5:5 v/v/v) for 2 h at 30 °C. The crude materials were purified by reversed-phase high performance liquid chromatography

Correspondence: Dr. Toshifumi Takao, Institute for Protein Research, Osaka University, Yamadaoka 3-2, Suita, Osaka 565-0871, Japan
E-mail: tak@protein.osaka-u.ac.jp
Fax: +81-6-68798603

Abbreviation: α-CHCA, α-cyano-4-hydroxycinnamic acid

Electrophoresis 2000, 21, 1694–1699
WILEY-VCH Verlag GmbH, 69451 Weinheim, 2000
0173-0835/00/0909-1694 $17.50+.50/0