ONTOLOGIES: SOLVING SEMANTIC HETEROGENEITY IN A FEDERATED SPATIAL DATABASE SYSTEM

Villie Morocho, Fèlix Saltor
Universitat Politècnica de Catalunya - LSI
c/Jordi Girona 1-3. 08034 Barcelona, Spain
{vmorocho,saltor}@lsi.upc.es

Lluis Pérez-Vidal
Universitat Politècnica de Catalunya - IG
Av. Diagonal 647, 8. 08028 Barcelona, Spain
lpv@lsi.upc.es

Key words: Semantic Heterogeneity, Geographic Integration, Interoperability, Federated Database, XML, UML, GML

Abstract: Information integration has been an important area of research for many years, and the problem of integrating geographic data has recently emerged. This paper presents an approach that uses Ontologies to solve the problem of semantic heterogeneity when constructing a Federated Schema for geographic data. We make use of standard technologies (OMT-G, based on UML; XMI, based on XML; and GML, from OpenGIS).

1 INTRODUCTION

Interoperability and integration of heterogeneous data have been major goals over the last few years, pursued by researchers everywhere from software corporations to international scientific institutions. This paper presents a framework based on BLOOM (Abelló et al., 1999), which in turn is based on the Federated Database Architecture (Sheth and Larson, 1990). In particular, the BLOOM architecture adds security levels. In this paper, moreover, we change the scope from traditional Databases to spatial Databases. Within this Federated Architecture, at the level of schema integration, we make use of Ontologies for solving Semantic Heterogeneity.

Semantically rich information (i.e., metadata and context information) is added to the Native Schema at the bottom level. This information helps in assessing semantic similarity across ontologies, which in turn allows the construction of the Federated Schema. In this framework, after the Geospatial Schema level (in which all models are native), these schemas are transformed into a Canonical Data Model.
A Canonical Data Model (CDM) (Castellanos et al., 1992) is a common model for all Component Schemas. As a first solution, we use OMT-G (Borges et al., 2001) as the CDM and take advantage of the features of this model. OMT-G provides primitives for modeling the geometry and topology of geographic data, with support for "whole-part" topological structures, network structures, multiple views of objects, and spatial relationships. As a second solution, we use the abstract model from OpenGIS (OpenGIS, 1999).

The other part of this paper proposes materializing the OMT-G or OpenGIS models in XMI (OMG, 2002). The main purpose of XMI is to enable easy interchange of metadata between modeling tools (based on OMG UML (OMG, 2001)) and metadata repositories (based on OMG MOF) in a distributed and heterogeneous environment.

Once the model is materialized in XMI, we construct the Ontologies for the objects in the model, so that each object has its own ontology. The ontologies of the different schemas to be integrated are then matched. This matching process shows whether there is a correspondence between Ontologies and which object in one schema corresponds semantically to which object in another. In this way, it is possible to achieve a semiautomatic schema integration. Continuing up the levels of the framework, the Federated Schema is authorized at the Authorized Schema level and then filtered through the External Schema to finally obtain a User Schema at the top of the framework.

In this paper we first analyze the integration problem in Section 2 and related research in Section 3. We present our architecture in Section 4 and then study the use of OMT-G, GML, and XMI in it. Finally, we consider future work in the last section.

1 Proceedings of the 5th International Conference on Enterprise Information Systems, pages 347–352, Angers, France, Apr 2003. ISBN: 972-98816-1a-8.
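
As an illustration of the ontology matching step outlined in this introduction, the following is a minimal sketch under stated assumptions, not the method of this paper. It assumes that each object's ontology can be reduced to a set of descriptive terms and that similarity is measured with the Jaccard coefficient; the representation, the names `jaccard` and `match_schemas`, the threshold value, and the sample data are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumption: an object's ontology is reduced to a set of descriptive terms.
# The paper does not prescribe this representation.
Ontology = Set[str]
Schema = Dict[str, Ontology]


def jaccard(a: Ontology, b: Ontology) -> float:
    """Fraction of terms shared by both ontologies (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_schemas(left: Schema, right: Schema,
                  threshold: float = 0.5) -> List[Tuple[str, str, float]]:
    """For each object in `left`, propose its best-scoring object in `right`.

    Only pairs scoring at or above `threshold` are proposed. A person is
    expected to confirm them, which keeps the process semiautomatic.
    """
    proposals = []
    for l_name, l_onto in left.items():
        scored = [(jaccard(l_onto, r_onto), r_name)
                  for r_name, r_onto in right.items()]
        if not scored:
            continue
        best_score, best_name = max(scored)
        if best_score >= threshold:
            proposals.append((l_name, best_name, best_score))
    return proposals


# Invented example data, for illustration only.
schema_a: Schema = {
    "River": {"watercourse", "line", "flows", "natural"},
    "Road": {"transport", "line", "paved"},
}
schema_b: Schema = {
    "Stream": {"watercourse", "line", "flows"},
    "Parcel": {"land", "polygon", "owner"},
}

proposals = match_schemas(schema_a, schema_b)
```

With this sample data, "River" and "Stream" share three of four distinct terms and are proposed as a correspondence, while "Road" has no candidate above the threshold and is left for manual treatment. A real matcher would likely also exploit the metadata and context information attached to the Native Schema.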