An End-to-End System for Organizing and Sharing Raw and Derived Mass Spectrometry Data

Cosmin Stejerean 1, Paiboon Siwamutita 2, E. D. Frank 3, Carol S Giometti 4, Gyorgy Babnigg 5, David Angulo 6*, Kevin Drew 7, Gregor von Laszewski 8

1 DePaul University, cosmin@cti.depaul.edu
2 DePaul University, paisi@yahoo.com
3 Argonne National Laboratory, efrank@mcs.anl.gov
4 Argonne National Laboratory, csgiometti@anl.gov
5 Argonne National Laboratory, gbabnigg@anl.gov
6 DePaul University, dangulo@cti.depaul.edu
7 University of Chicago, kdrew@cs.uchicago.edu
8 Argonne National Laboratory, gregor@mcs.anl.gov

Abstract

The growing volume of work in mass spectrometry (MS) is generating a large amount of data. This data needs to be properly managed, organized, and shared among researchers at various institutions. The problem is further complicated by the different proprietary formats used by manufacturers of MS machines. We demonstrate an end-to-end system that automates the conversion of the data to an open format and its upload to a centralized server, where it is easily organized and managed. The system allows scientists to browse, download, and use the data with third-party tools. The user view is simple and hides the underlying data-management system.

1 Introduction

Mass spectrometry (MS), applied to proteomics, is a method for identifying molecules by their mass-to-charge ratio. MS machines sort and measure individual charged molecules (ions) by mass and charge. To form these ions, the protein or peptide fragment must be brought into the gas phase (desorption) and ionized by adding one or more protons or electrons. The major ionization and desorption methods for proteins are electrospray ionization (ESI) and matrix-assisted laser desorption ionization (MALDI).
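As an illustration of the mass-to-charge ratio described above, the following minimal sketch computes the m/z value expected for a molecule that has gained a given number of protons. The function name and the example mass are hypothetical and are not part of the system described in this paper; the only physical input is the standard proton mass of about 1.00728 Da.

```python
PROTON_MASS_DA = 1.007276  # proton mass in daltons (standard physical constant)


def mz_of_protonated_ion(neutral_mass_da: float, charge: int) -> float:
    """Return m/z for a molecule of the given neutral mass carrying `charge` added protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Each added proton contributes both one unit of charge and one proton mass.
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


if __name__ == "__main__":
    peptide_mass = 1500.0  # hypothetical neutral peptide mass in Da
    for z in (1, 2, 3):
        print(f"z={z}: m/z = {mz_of_protonated_ion(peptide_mass, z):.4f}")
```

The same neutral mass therefore appears at a different m/z value for each charge state, which is why both mass and charge matter when interpreting a spectrum.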
The ions can be fragmented further in the MS machine to obtain more information about the protein, such as the amino acid sequence or the presence of post-translational modifications [BUC05]. Tandem mass spectrometry is one of the most sensitive and most reliable methods of protein identification currently available.

Many manufacturers compete to produce MS machines, including Applied Biosystems (ABI) [ABI], IonSpec [IONSPEC], Waters Corporation (MicroMass equipment) [WATERS], ThermoFinnigan [TFG], and Agilent.

* Contact author

The manufacturers