GIN-db: a relational scheme for the integration of multi-species gene interaction data.

Claude Sabatier (1), Azzeddine Kadi (2), Catherine Verheecke-Mauzé (1,2), Karine Souquet (2), David Martin (2), Magali Lescot (2), Pierre Mouren (2), Aitor González González (2), Anaïs Baudot (2), Christine Brun (2), Claudine Chaouiya (2), Bernard Jacq (2), and Denis Thieffry (2)

(1) Laboratoire d'Informatique Fondamentale, Campus de Luminy, CNRS Case 901, 13288 Marseille, France. Claude.sabatier@lim.univ-mrs.fr
(2) IBDM-LGPD, Campus de Luminy, CNRS Case 907, 13288 Marseille, France. {jacq,thieffry}@lgpd.univ-mrs.fr

Keywords. Relational database, genetic interaction, molecular interaction, gene ontology

Introduction

Massive genome sequencing provides us with complete sequence data, which in turn lead to the derivation of nearly complete gene sets for model organisms. In addition, a complete genome constitutes an excellent framework to interconnect and organise all molecular and genetic data available for a given organism. More recently, the development of functional genomics has been generating large amounts of data on gene expression at the transcriptional (transcriptomics) or protein (proteomics) levels, as well as on direct interactions between macromolecules (e.g., the two-hybrid method in the case of protein-protein interactions). These fast developments call for corresponding software developments to properly integrate these numerous and varied data.

In this context, we are developing a relational database, christened "GIN-db", for Gene Interaction Networks database. This database development supports a series of ongoing bioinformatic studies on the comparative, structural, and dynamic analysis of genetic regulatory networks [1].
Genetic regulation is understood here in a broad sense, covering molecular interactions such as the binding of transcription factors to specific pieces of DNA or protein-protein interactions, as well as data coming from classical genetic experiments. Since inter-species comparisons play a crucial role in building our understanding of key regulatory pathways and signalling cascades, GIN-db encompasses data from different model organisms, including D. melanogaster, S. cerevisiae, M. musculus and H. sapiens, with the possibility of specifying homology relationships between genes of these species.

Gene Interaction Network database (GIN-db)

Our aim is to integrate different kinds of functional data obtained by various types of experimental approaches, while keeping track of the specific aspects of each. Although we work extensively to curate the included data, we believe that interesting data do not always converge to a clean, unequivocal representation, in particular when one deals with complex functional aspects. Consequently, GIN-db lets users compare different types of data and build their own point of view. This approach is particularly visible in our distinction between two broad types of relationships between molecular entities: "physical relationships", such as complexes or inclusions, versus "functional relationships", such as homologies or interactions (see Figure 1).

Developed with the Oracle 9i DBMS under Solaris, GIN-db allows the encoding of various types of biological interactions: "molecular interactions" such as protein-DNA, RNA-protein, or protein-protein interactions, which are the most frequently encountered in gene networks; but also "genetic interactions". Such interactions can be oriented (as in the case of protein-DNA interactions), but also left unoriented (either because of a lack of knowledge, or due to their very nature).
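
To make the distinction between relationship types and between oriented and unoriented interactions more concrete, the following is a minimal sketch of how such data could be laid out relationally. It is not the actual GIN-db schema: it uses Python's built-in sqlite3 module rather than Oracle, and every table name, column name, and helper function (`entity`, `relationship`, `create_db`, `add_entity`, `add_relationship`, `partners`) is a hypothetical choice made for illustration only.

```python
import sqlite3

# Hypothetical, simplified schema. Table and column names are illustrative
# assumptions and do not reproduce the actual GIN-db Oracle schema.
SCHEMA = """
CREATE TABLE entity (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    kind    TEXT NOT NULL CHECK (kind IN ('gene', 'protein', 'RNA', 'DNA')),
    species TEXT NOT NULL
);
CREATE TABLE relationship (
    id        INTEGER PRIMARY KEY,
    -- 'physical' (complex, inclusion) versus 'functional' (homology, interaction)
    category  TEXT NOT NULL CHECK (category IN ('physical', 'functional')),
    type      TEXT NOT NULL,
    source_id INTEGER NOT NULL REFERENCES entity(id),
    target_id INTEGER NOT NULL REFERENCES entity(id),
    -- 1 = oriented (source acts on target), 0 = unoriented
    oriented  INTEGER NOT NULL CHECK (oriented IN (0, 1))
);
"""


def create_db():
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_entity(conn, name, kind, species):
    """Insert a molecular entity and return its row id."""
    cur = conn.execute(
        "INSERT INTO entity (name, kind, species) VALUES (?, ?, ?)",
        (name, kind, species),
    )
    return cur.lastrowid


def add_relationship(conn, category, rel_type, source_id, target_id, oriented):
    """Insert a relationship between two entities and return its row id."""
    cur = conn.execute(
        "INSERT INTO relationship (category, type, source_id, target_id, oriented)"
        " VALUES (?, ?, ?, ?, ?)",
        (category, rel_type, source_id, target_id, int(oriented)),
    )
    return cur.lastrowid


def partners(conn, entity_id):
    """Return the sorted names of entities reachable from entity_id.

    Oriented relationships are followed from source to target only;
    unoriented relationships are followed in both directions.
    """
    rows = conn.execute(
        """
        SELECT e.name FROM relationship r
        JOIN entity e ON e.id = r.target_id
        WHERE r.source_id = ?
        UNION
        SELECT e.name FROM relationship r
        JOIN entity e ON e.id = r.source_id
        WHERE r.target_id = ? AND r.oriented = 0
        ORDER BY 1
        """,
        (entity_id, entity_id),
    ).fetchall()
    return [name for (name,) in rows]
```

With placeholder entities (the names below are made up, not real genes), an oriented protein-DNA interaction is visible only from the protein side, whereas an unoriented protein-protein interaction is visible from both partners:

```python
conn = create_db()
x = add_entity(conn, "proteinX", "protein", "D. melanogaster")
site = add_entity(conn, "geneA_promoter", "DNA", "D. melanogaster")
y = add_entity(conn, "proteinY", "protein", "D. melanogaster")
add_relationship(conn, "functional", "protein-DNA", x, site, True)
add_relationship(conn, "functional", "protein-protein", x, y, False)
print(partners(conn, x))     # entities reachable from proteinX
print(partners(conn, site))  # nothing: the protein-DNA interaction is oriented
```

Storing orientation as a flag on a single relationship table, rather than duplicating unoriented rows in both directions, is one possible design choice among several; it keeps each experimental observation as a single record.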