RELATIVE PROTEIN QUANTITATION WITH POST TRANSLATIONAL MODIFICATIONS IN MASS SPECTROMETRY BASED PROTEOMICS

Jens Allmer
Molecular Biology and Genetics, Izmir Institute of Technology
Gulbahce Campus, Urla, Izmir, 35430, Turkey
phone: + (90) 232 750 7517, fax: + (90) 232 750 7509, email: jens@allmer.de
web: http://jens.allmer.de

ABSTRACT

Mass spectrometry has become the tool of choice for most investigations in proteomics. Identification of proteins from complex mixtures has long been achieved and is now routinely used in countless high-throughput studies. Quantitation by mass spectrometry is comparatively new, and many different strategies have been proposed. One such strategy quantitates the difference in protein expression level among samples via extracted ion chromatograms, spectral counts, or a combination thereof. Another strategy involves mass modifications of the analytes in one or more of the samples under investigation. MSMAG has been developed as an extension to 2DB, and it has been shown that it can aid in the quantitation of data from experiments employing label-free quantitation. Recently, it has been extended to allow for the analysis of data based on labelling strategies. This also makes it possible to quickly visualize and investigate inherent mass differences as presented by post-translational modifications.

1. INTRODUCTION

Proteomics aims to elucidate the protein complement of a genome, taking into account the spatial and temporal expression patterns of proteins. Historically, one gene was thought to produce one protein; today we know that, due to alternative splicing, RNA editing, and protein splicing, this is hardly the case and one gene usually gives rise to multiple proteins with different sequences [1,2]. This creates a large number of possible gene products, which is further increased by post-translational modifications (PTMs), in which chemical groups are attached to amino acids.
Adding PTMs to amino acids in the sequence does not lead to a new protein, since the sequence remains the same, but rather creates a new protein species [3,4]. Systems biology is interested in all protein species expressed by a genome, their interactions, expression patterns, locations, and their absolute and relative quantities under all possible conditions [5]. While this goal seems elusive, it is important to create the tools necessary to enable steps towards the overall aim of systems biology.

Mass spectrometry (MS) has become the tool of choice when studying proteins [6]. Due to the complexity of MS data, many software tools [7-9] have been proposed for analysis and different data storage facilities have been developed [10-12]. We recently developed 2DB, an application for storage, analysis, and presentation of results from MS/MS analyses, which was first introduced at the 2007 HIBIT and whose improvements were presented at the 2009 HIBIT symposium [13]. In addition to protein identification, for which MS has long been used, protein quantities and their relative differences among samples can now be examined. Two general strategies are employed for quantitation. One of these involves differential labelling of peptides in different samples with a marker that changes the mass such that the difference in mass-to-charge ratio (m/z) between labelled and unlabelled analytes can be resolved by a mass spectrometer [14-16]. Labelling requires additional effort, which involves extra cost and an increase in labour time as well as a potential rise in sample complexity [17], whereas label-free quantitation may be done without extra effort. Relative protein quantities can be determined without adding a label to the analytes, based on the notion that protein abundance correlates with the number of spectra and the intensity of precursor ions [18-20].
One method to perform label-free quantitation with liquid chromatography (LC) MS/MS data is spectral counting [21-23]. The count can additionally be weighted by the total ion current (TIC) [24,25] or by the score reported by the identification software [26].

MSMAG is an extension to 2DB and can be used for label-free quantitation, essentially as a new feature in the analysis and presentation module, while the underlying data model remains unchanged [25]. Since PTMs are also modelled in the database, it was natural to extend the quantitation facility to allow for quantitation of labelled analytes. Viewing labels as an arbitrary change in the measurable mass-to-charge ratio (m/z) opens the possibility of treating inherent PTMs as if they were labels. With this notion, MSMAG has been extended to allow for visualization of the distribution of PTMs over the fractions of a sample. Additionally, the interactive quantitation of labels and PTMs among samples has been enabled, and thus relative quantities of modified and unmodified analytes can be investigated. A dataset has been created in silico to test the new features. This dataset will be presented in the next section.
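
As an illustration of the label-free measures and the label-as-mass-shift view described above, the following minimal Python sketch computes plain and TIC-weighted spectral counts per sample, a ratio between two samples, and the m/z shift caused by a mass modification. It is not the MSMAG implementation; the record layout (`sample`, `protein`, `tic`), the function names, and the example values are assumptions made only for this sketch. The value 79.96633 Da is the monoisotopic mass shift of phosphorylation (HPO3).

```python
from collections import defaultdict


def spectral_counts(psms, weight=None):
    """Sum one count (or one weight) per identified spectrum.

    psms   -- iterable of dicts with the hypothetical keys
              "sample", "protein" and, optionally, a weight field
    weight -- name of the field used as weight (e.g. "tic");
              None gives plain spectral counting
    """
    totals = defaultdict(float)
    for psm in psms:
        w = 1.0 if weight is None else float(psm[weight])
        totals[(psm["sample"], psm["protein"])] += w
    return dict(totals)


def relative_quantity(totals, protein, sample_a, sample_b):
    """Ratio of the summed measure in sample_a over sample_b.

    Returns None when the protein was not observed in sample_b,
    since the ratio is undefined in that case.
    """
    a = totals.get((sample_a, protein), 0.0)
    b = totals.get((sample_b, protein), 0.0)
    return None if b == 0 else a / b


def mz_shift(delta_mass, charge):
    """m/z difference between modified and unmodified analyte.

    A label or PTM adding delta_mass (Da) to an ion of the given
    charge state shifts the observed m/z by delta_mass / charge.
    """
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return delta_mass / charge


# Hypothetical identifications: two spectra in "control", four in "treated".
example_psms = [
    {"sample": "control", "protein": "P1", "tic": 2.0e6},
    {"sample": "control", "protein": "P1", "tic": 1.0e6},
    {"sample": "treated", "protein": "P1", "tic": 3.0e6},
    {"sample": "treated", "protein": "P1", "tic": 3.0e6},
    {"sample": "treated", "protein": "P1", "tic": 3.0e6},
    {"sample": "treated", "protein": "P1", "tic": 3.0e6},
]

plain = spectral_counts(example_psms)
weighted = spectral_counts(example_psms, weight="tic")

count_ratio = relative_quantity(plain, "P1", "treated", "control")
tic_ratio = relative_quantity(weighted, "P1", "treated", "control")

PHOSPHO_DELTA = 79.96633  # Da, monoisotopic mass shift of HPO3
shift_2plus = mz_shift(PHOSPHO_DELTA, 2)
```

In this toy example the two measures disagree (the count ratio is 2, the TIC-weighted ratio is 4), which shows why the choice of weighting matters when relative quantities are reported. The same `relative_quantity` helper could be applied to modified versus unmodified forms of an analyte if the modification state were used in place of the sample name.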