R. Meersman, P. Herrero, and T. Dillon (Eds.): OTM 2009 Workshops, LNCS 5872, pp. 574–583, 2009.
© Springer-Verlag Berlin Heidelberg 2009

Efficient Management of Biomedical Ontology Versions

Toralf Kirsten 1,2, Michael Hartung 1, Anika Groß 1, and Erhard Rahm 1,3

1 Interdisciplinary Centre for Bioinformatics, University of Leipzig
2 Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig
3 Department of Computer Science, University of Leipzig
{gross,hartung,tkirsten}@izbi.uni-leipzig.de, rahm@informatik.uni-leipzig.de

Abstract. Ontologies have become very popular in life sciences and other domains. They mostly undergo continuous changes and new ontology versions are frequently released. However, current analysis studies do not consider the ontology changes reflected in different versions but typically limit themselves to a specific ontology version which may quickly become obsolete. To allow applications easy access to different ontology versions we propose a central and uniform management of the versions of different biomedical ontologies. The proposed database approach takes concept and structural changes of succeeding ontology versions into account thereby supporting different kinds of change analysis. Furthermore, it is very space-efficient by avoiding redundant storage of ontology components which remain unchanged in different versions. We evaluate the storage requirements and query performance of the proposed approach for the Gene Ontology.

Keywords: Ontology versioning, integration, management.

1 Introduction

Many ontologies have recently been developed and are frequently used in life sciences and other domains. In particular, ontologies are used to annotate resources by semantically describing their properties.
For instance, molecular-biological objects are annotated with concepts of the well-known Gene Ontology (GO) [4] to describe their function or to specify the biological processes the objects are involved in. Moreover, personal properties of patients such as diseases can be described by concepts of medical ontologies, e.g., OMIM (http://www.ncbi.nlm.nih.gov/omim) or NCI Thesaurus [12]. Many analysis studies utilize such ontology-based annotations to better understand the real-world impact of certain observations. For instance, GO annotations are used for functional profiling [1] of large gene expression datasets to determine the semantic function of certain sets of heavily expressed genes.

Most ontologies, especially in the life sciences, are frequently changed to capture new insights or to correct previous specifications [5, 14]. Such evolutionary changes typically include the addition, deletion, and modification of concepts, relationships, and attribute values/descriptions. These changes are incorporated in newer ontology versions that