Journal of Biotechnology 106 (2003) 157–167

Building a BRIDGE for the integration of heterogeneous data from functional genomics into a platform for systems biology

Alexander Goesmann a,*, Burkhard Linke a, Oliver Rupp a, Lutz Krause a, Daniela Bartels a, Michael Dondrup a, Alice C. McHardy a, Andreas Wilke a, Alfred Pühler b, Folker Meyer a

a Center for Genome Research, Bielefeld University, D-33594 Bielefeld, Germany
b Lehrstuhl für Genetik, Fak. für Biologie, Bielefeld University, D-33594 Bielefeld, Germany

Received 29 April 2003; received in revised form 7 August 2003; accepted 13 August 2003

* Corresponding author. Fax: +49-521-106-5626. E-mail address: alexander.goesmann@genetik.uni-bielefeld.de (A. Goesmann).

0168-1656/$ – see front matter. doi:10.1016/j.jbiotec.2003.08.007

Abstract

The flood of data acquired from the increasing number of publicly available genomes has led to new demands for bioinformatics software. With the growing amount of information resulting from high-throughput experiments, new questions arise that often focus on the comparison of genes, genomes, and their expression profiles. Inferring new knowledge by combining different kinds of “post-genomics” data requires new approaches that allow the integration of variable data sources into a flexible framework. In this paper, we describe our concept for the integration of heterogeneous data into a platform for systems biology. We have implemented a Bioinformatics Resource for the Integration of heterogeneous Data from Genomic Explorations (BRIDGE) and illustrate the usability of our approach as a platform for systems biology with two sample applications.

© 2003 Elsevier B.V. All rights reserved.

Keywords: Systems biology; Genome annotation; Data integration

1. Introduction

Today, roughly 50–60% of all genes in a newly sequenced bacterial genome can be classified almost automatically based on sequence similarity (Fraser et al., 2000). A functional annotation can be assigned by using widespread tools like BLAST (Altschul et al., 1997), HMMer (Eddy, 1998), InterPro (Apweiler et al., 2001) and many others. For the remaining 40–50% of genes, identifying a function is still a laborious task. These new genes, which encode some special features of the organism, are often the most interesting ones for scientific progress or commercial purposes. Hence, it should always be worthwhile to spend some time and money on their detailed analysis. As shown in Fig. 1, different high-throughput methods can be applied that support the analysis of uncharacterized genes. Nevertheless, detailed single-gene analysis methods such as knockout mutants or RT-PCR are still indispensable