An ontology-driven approach for the extraction and description of geographic objects contained in raster spatial data

Rolando Quintero, Giovanni Guzmán, Rolando Menchaca-Mendez, Miguel Torres, Marco Moreno-Ibarra

Intelligent Processing of Geospatial Information Laboratory, Computer Research Center, National Polytechnic Institute, Mexico City, Mexico
UPALM-Zacatenco, CIC Building, 07738 D.F. Mexico, Mexico

Keywords: Raster data; Semantics; Ontology

Abstract

In this paper, we present FERD, a methodology aimed to automatically identify, extract and describe relevant spatial objects contained in raster spatial datasets. Our objective is to provide a set of computational tools capable of finding landforms contained in the datasets that match human-friendly descriptions such as “In this model there is a mountain having a maximum altitude of 302 m, located between coordinates (19.09383°N, 99.85541°W) and (19.09393°N, 99.85554°W)”. The proposed methodology is composed of three main stages: in the first stage (conceptualization), the knowledge domain is represented by means of ontologies. In the second stage (synthesis) a novel semantic decomposition algorithm is used to identify and extract relevant spatial objects from the spatial dataset. In the last stage (description), the geographic objects extracted in the second stage are mapped to concepts (objects of the knowledge domain) generated in the first stage. The final result is a set of metadata that describes the geomorphologic objects contained in the raster dataset.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

Nowadays, geospatial information is becoming pervasive and people are applying this type of information in an increasing variety of disciplines, from personal leisure applications to large scale governmental systems.
This broad spectrum of applications raises new challenges: people are not concerned with the complexity of the data or the way it is encoded, yet they demand more and better geospatial information. Projections, scales and datums are not meaningful to most people; they are interested in places to have lunch and how to get there. Thus, it is necessary to develop methods and algorithms that translate raw data into the type of information that people require.

Raster spatial datasets (RSDSs) are a type of data that must be processed in order to transform raw information into something meaningful to a regular person. The process of mapping spatial objects to semantic objects is called semantic representation. In this paper, a methodology (FERD) for semantically extracting and describing objects contained in an RSDS is presented. It is based on three stages: conceptualization, synthesis and description. The goal of the proposed methodology is to build human-friendly representations of spatial objects based on the knowledge that people have about the geospatial domain.

The conceptualization stage attempts to capture and organize common knowledge about the problem's domain. In other words, in this stage we define the concepts that people use when they talk or think about a specific domain (e.g., hydrology, landforms, transportation infrastructure, and so on). The stage consists of two tasks: (1) conceptualization of the geospatial domain and (2) conceptualization of the particular domain. In previous work, we proposed GEONTO-MET, a methodology to create formal conceptualizations of the geographic domain (Torres, Quintero, Moreno-Ibarra, Menchaca-Mendez, & Guzman, 2011). GEONTO-MET is based on reducing the set of axiomatic relations of a conceptualization in order to obtain a reduced set of basic relations that can be used to define the remaining relations contained in the conceptualization.
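The idea of defining all remaining relations from a small basic set can be illustrated with a minimal sketch. It is not the GEONTO-MET implementation: the class `TinyOntology`, the choice of `"is"` and `"has"` as the basic relations, and the concept names are all assumptions made for illustration, and none of them are taken from Kaab or Hunxeet. Only the basic relations may be asserted; the relations `is_kind_of` and `properties` are computed from them.

```python
from collections import defaultdict

# Hypothetical basic relations, chosen for illustration only.
BASIC_RELATIONS = {"is", "has"}


class TinyOntology:
    """Toy concept store in which only basic relations are asserted."""

    def __init__(self):
        # relation name -> subject concept -> set of object concepts
        self._facts = defaultdict(lambda: defaultdict(set))

    def assert_fact(self, subject, relation, obj):
        """Record a fact; relations outside the basic set are rejected."""
        if relation not in BASIC_RELATIONS:
            raise ValueError(f"only basic relations may be asserted: {relation!r}")
        self._facts[relation][subject].add(obj)

    def ancestors(self, concept):
        """All concepts reachable from `concept` through chains of 'is'."""
        seen = set()
        stack = [concept]
        while stack:
            current = stack.pop()
            for parent in self._facts["is"].get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def is_kind_of(self, concept, other):
        """Derived relation: transitive closure of the basic 'is' relation."""
        return other in self.ancestors(concept)

    def properties(self, concept):
        """Derived relation: own 'has' facts plus those inherited via 'is'."""
        result = set(self._facts["has"].get(concept, ()))
        for ancestor in self.ancestors(concept):
            result.update(self._facts["has"].get(ancestor, ()))
        return result


# Illustrative concepts (assumed, not from the paper's ontologies).
onto = TinyOntology()
onto.assert_fact("landform", "has", "elevation")
onto.assert_fact("mountain", "is", "landform")
onto.assert_fact("mountain", "has", "summit")
onto.assert_fact("volcano", "is", "mountain")
onto.assert_fact("volcano", "has", "crater")

print(sorted(onto.properties("volcano")))
print(onto.is_kind_of("volcano", "landform"))
```

In this sketch a new derived relation needs no new stored facts, only a new function over the basic ones, which is one way to read the reduction described above.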
Defining the remaining relations from a reduced basic set in this way yields greater semantic richness. Using GEONTO-MET, we developed an ontology (called Kaab) of the geographic domain that is based on the data dictionaries of the Mexican National Institute of Statistics, Geography and Informatics (INEGI). Similarly, we used GEONTO-MET and the dictionary of the International Organization for Standardization on environmental data (ISO, 2005) to create an ontology (called Hunxeet) of the landforms domain. One of the main advantages of the ontologies created using GEONTO-MET is that the relationships among concepts are not predefined; instead, they are part of the conceptualization itself.

Expert Systems with Applications 39 (2012) 9008–9020. doi:10.1016/j.eswa.2012.02.033
Corresponding author: R. Quintero. E-mail address: rquintero@ipn.mx

The synthesis stage processes the raster spatial datasets (RSDSs) using three classical phases of digital image processing: pre-processing, processing and post-processing. This stage extracts and