Jens Mattow¹, Peter R. Jungblut¹, Eva-Christina Müller², Stefan H. E. Kaufmann¹

¹Max-Planck-Institute for Infection Biology, Berlin, Germany
²Max Delbrück Centre for Molecular Medicine, Berlin, Germany

Identification of acidic, low molecular mass proteins of Mycobacterium tuberculosis strain H37Rv by matrix-assisted laser desorption/ionization and electrospray ionization mass spectrometry

Matrix-assisted laser desorption/ionization-mass spectrometry peptide mass mapping and nano-electrospray ionization tandem mass spectrometry were used to identify acidic, low molecular mass proteins of Mycobacterium tuberculosis strain H37Rv. Proteins were extracted from whole cell lysates of mycobacteria, separated by high resolution two-dimensional electrophoresis (2-DE) and analysed by mass spectrometry (MS). Silver-stained 2-DE patterns resolved about 1800 distinct protein species, 190 of which had an observed isoelectric point and molecular mass in the range of pH 4 to 6 and 6 to 15 kDa, respectively. Seventy-six spots from this range were excised from Coomassie Brilliant Blue G250-stained gels and analysed by MS, from which 72 were identified. These spots were shown to represent products of as many as 50 different protein-coding genes. Ten genes gave rise to more than one protein species. Eleven spots contained more than one protein. The present study led to the identification of 15 mycobacterial proteins with assigned putative functions, 28 conserved hypothetical proteins and one unknown protein. Most proteins of the latter two groups had previously been predicted at the DNA level only. Six additional spots were shown to comprise proteins encoded by open reading frames that have not been predicted for M. tuberculosis H37Rv by genomic investigations.
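The counts reported in the abstract can be cross-checked with a short tally. The following is only an illustrative sketch and not part of the authors' workflow. All variable and function names are hypothetical, and it assumes that the six additional spots correspond to six distinct open reading frames, which the abstract does not state explicitly.

```python
# Illustrative tally of the counts given in the abstract.
# All names are hypothetical; only the numbers come from the text.

SPOTS_RESOLVED = 1800    # "about 1800" silver-stained protein species
SPOTS_IN_WINDOW = 190    # observed pI 4 to 6 and mass 6 to 15 kDa
SPOTS_EXCISED = 76       # excised from Coomassie-stained gels
SPOTS_IDENTIFIED = 72    # identified by MS

# Gene products grouped by annotation class.
GENE_CLASSES = {
    "assigned putative function": 15,
    "conserved hypothetical": 28,
    "unknown": 1,
    # Assumption: the six additional spots map to six distinct ORFs
    # that were not predicted for M. tuberculosis H37Rv.
    "ORF not previously predicted": 6,
}


def fraction(part: int, whole: int) -> float:
    """Return part / whole as a plain ratio."""
    return part / whole


total_genes = sum(GENE_CLASSES.values())
identification_rate = fraction(SPOTS_IDENTIFIED, SPOTS_EXCISED)
window_share = fraction(SPOTS_IN_WINDOW, SPOTS_RESOLVED)

if __name__ == "__main__":
    print(f"genes represented:   {total_genes}")
    print(f"identification rate: {identification_rate:.1%}")
    print(f"share of all spots:  {window_share:.1%}")
```

Under that assumption the four classes sum to 50, which agrees with the "as many as 50 different protein-coding genes" stated above. The rate of 72 out of 76 corresponds to roughly 95% of the excised spots.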
Keywords: Mycobacterium tuberculosis / Low molecular mass proteins / Proteomics / Two-dimensional electrophoresis / Mass spectrometry / Peptide mass mapping

PRO 0053

1 Introduction

Tuberculosis (TB), caused by Mycobacterium tuberculosis, is one of the most prevalent infectious diseases. Each year about eight million new cases of TB are notified globally, two million of which prove fatal [1]. In 1998 the entire DNA sequence of M. tuberculosis H37Rv was published [2]. In the meantime, the genome projects of six additional mycobacterial strains are nearing completion [3]. These include Mycobacterium bovis, Mycobacterium leprae and another virulent strain of M. tuberculosis, the clinical isolate CDC1551. Together with the development of rapid and highly sensitive mass spectrometric methods for protein identification, the availability of the entire DNA sequence of M. tuberculosis H37Rv has paved the way for high-throughput proteomic investigations on mycobacteria in recent years. These are mostly based on the identification and characterisation of mycobacterial proteins separated by 2-DE.

The proteome of an organism or cell reflects its functional status in response to physiological and environmental conditions. Proteomics can be used to complement genomic investigations. The expression of predicted protein-coding ORFs needs to be verified by the identification of expressed proteins and assessment of protein sequences. Proteomics provides the opportunity to determine which ORFs of a genome are actually translated into functional proteins: the functional genome. Furthermore, proteomics may be used to examine the cellular and subcellular distribution of proteins and their relative concentrations. Due to differential pre-mRNA splicing and differential co- and post-translational protein modifications, a gene can give rise to more than one protein product.
In contrast to investigations performed at the DNA or RNA level, proteomics provides the opportunity to determine whether proteins exist in multiple protein species, and to study their extent of co- and post-translational modifications.

Correspondence: Dr. Jens Mattow, Max-Planck-Institute for Infection Biology, Department of Immunology, Schumannstr. 21/22, D-10117 Berlin, Germany
E-mail: mattow@mpiib-berlin.mpg.de
Fax: +49-30-28460-501

Abbreviations: AA, amino acids; GroES, 10 kDa chaperone; HspX, 14 kDa antigen; PMM, peptide mass mapping; SC, sequence coverage; TB, tuberculosis

Proteomics 2001, 1, 494–507
WILEY-VCH Verlag GmbH, 69451 Weinheim, 2001 1615-9853/01/0404–494 $17.50+.50/0

In genome projects ORFs are predicted by routine bioinformatic methods, including codon usage, positional base preference and database searches. Initially, 3924 protein-coding ORFs were predicted for M. tuberculosis H37Rv using these