R. Meersman, P. Herrero, and T. Dillon (Eds.): OTM 2009 Workshops, LNCS 5872, pp. 574–583, 2009.
© Springer-Verlag Berlin Heidelberg 2009
Efficient Management of Biomedical Ontology Versions
Toralf Kirsten¹,², Michael Hartung¹, Anika Groß¹, and Erhard Rahm¹,³
¹ Interdisciplinary Centre for Bioinformatics, University of Leipzig
² Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig
³ Department of Computer Science, University of Leipzig
{gross,hartung,tkirsten}@izbi.uni-leipzig.de,
rahm@informatik.uni-leipzig.de
Abstract. Ontologies have become very popular in the life sciences and other
domains. Most of them change continuously, and new ontology versions are
released frequently. However, current analysis studies do not consider the
ontology changes reflected in different versions; they typically limit themselves
to a specific ontology version, which may quickly become obsolete. To give
applications easy access to different ontology versions, we propose a central and
uniform management of the versions of different biomedical ontologies. The
proposed database approach takes concept and structural changes between
succeeding ontology versions into account, thereby supporting different kinds of
change analysis. Furthermore, it is very space-efficient because it avoids
redundant storage of ontology components that remain unchanged across versions.
We evaluate the storage requirements and query performance of the proposed
approach for the Gene Ontology.
Keywords: Ontology versioning, integration, management.
1 Introduction
Many ontologies have recently been developed and are frequently used in the life
sciences and other domains. In particular, ontologies are used to annotate
resources by semantically describing their properties. For instance,
molecular-biological objects are annotated with concepts of the well-known Gene
Ontology (GO) [4] to describe their function or to specify the biological
processes in which the objects are involved. Moreover, personal properties of
patients, such as diseases, can be described by concepts of medical ontologies,
e.g., OMIM (http://www.ncbi.nlm.nih.gov/omim) or the NCI Thesaurus [12]. Many
analysis studies utilize such ontology-based annotations to better understand the
real-world impact of certain observations. For instance, GO annotations are used
for functional profiling [1] of large gene expression datasets to determine the
semantic function of certain sets of heavily expressed genes.
Most ontologies, especially in the life sciences, are changed frequently to capture
new insights or to correct previous specifications [5, 14]. Such evolutionary
changes typically include the addition, deletion, and modification of concepts,
relationships, and attribute values/descriptions. These changes are incorporated in
newer ontology versions that