GRAPH-BASED KNOWLEDGE REPRESENTATION FOR GIS DATA
Manuel Pech Palacio¹, David Sol¹, Jesús González²
{sp205175, sol}@mail.udlap.mx, jagonzalez@inaoep.mx
¹ Universidad de las Américas-Puebla
² Instituto Nacional de Astrofísica Óptica y Electrónica
Puebla, México
Abstract
This paper proposes a graph representation for GIS data
that covers both spatial and non-spatial data and includes
the spatial relations between spatial objects. Because
graphs are a powerful and flexible knowledge
representation, they allow spatial and non-spatial data to
be combined in a single structure, which is one of the
strengths of the proposal. We intend to apply this
knowledge representation to data mining on GIS data,
including three types of spatial relations: topological,
orientation, and distance.
1. Introduction
In recent years, the human capacity to generate and
collect data has grown rapidly. The explosive growth in
data and databases has created a need for techniques and
tools that can transform the data into useful information
and knowledge. Initially, the goal of these techniques and
tools was to discover knowledge in relational data.
Nowadays, with the growth of applications that deal with
georeferenced data, the management and analysis of
spatial data have gained considerable importance.
Spatial data has many characteristics that distinguish it
from relational data. For example, it carries topological,
distance, and direction information and is organized by
multidimensional spatial index structures. Another
difference is the query language used to access spatial
data. The complexity of spatial data types is a further
distinguishing feature.
Different approaches have been developed for
knowledge discovery from spatial data. We briefly
present some of them next:
Generalization [22][14]. Data and objects often
contain detailed information at primitive concept levels. It
is often desirable to summarize a large set of data and
present it at a higher concept level. Generalization
assumes the existence of background knowledge in the
form of concept hierarchies. In a spatial database, there
can be two kinds of concept hierarchies: thematic and
spatial. Lu et al. [22] extended attribute-oriented
induction to spatial databases and presented two
algorithms: spatial-data-dominant and
non-spatial-data-dominant generalization.
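
The idea of climbing a concept hierarchy can be illustrated with a minimal sketch. It is not the algorithm of Lu et al. [22]; the hierarchy, the record layout, and the function names `generalize` and `summarize` are assumptions made only for this example. Each thematic value is replaced by its parent concept, and identical generalized tuples are merged and counted.

```python
from collections import Counter

# Hypothetical thematic concept hierarchy: child concept -> parent concept.
THEMATIC_HIERARCHY = {
    "corn": "grain",
    "wheat": "grain",
    "apple": "fruit",
    "orange": "fruit",
    "grain": "crop",
    "fruit": "crop",
}


def generalize(value, hierarchy, levels=1):
    """Climb `levels` steps up the hierarchy, stopping early at the root."""
    for _ in range(levels):
        if value not in hierarchy:
            break
        value = hierarchy[value]
    return value


def summarize(records, hierarchy, levels=1):
    """Generalize the thematic attribute and count merged (region, concept) tuples."""
    counts = Counter()
    for region, crop in records:
        counts[(region, generalize(crop, hierarchy, levels))] += 1
    return dict(counts)


# Illustrative records: (spatial region, thematic value).
records = [
    ("north", "corn"),
    ("north", "wheat"),
    ("north", "apple"),
    ("south", "orange"),
    ("south", "apple"),
]
summary = summarize(records, THEMATIC_HIERARCHY, levels=1)
```

Here the thematic attribute is generalized while the region is kept fixed; a spatial hierarchy (for example, district to state) could be climbed in the same way.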
Clustering [16][23][28][26] can be defined as the
process of grouping physical or abstract objects into
classes of similar objects. Spatial data clustering identifies
clusters, or densely populated regions, according to some
measurement in a large, multidimensional data set.
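
As a rough illustration of identifying densely populated regions, the following sketch bins two-dimensional points into a grid and joins adjacent dense cells. It is a simplified grid-density heuristic chosen for brevity, not one of the cited algorithms [16][23][28][26]; the function name `grid_clusters` and the parameters `cell_size` and `min_points` are assumptions of this example.

```python
from collections import defaultdict


def grid_clusters(points, cell_size=1.0, min_points=2):
    """Group points that fall in connected, densely populated grid cells."""
    # Assign every point to the grid cell that contains it.
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))

    # A cell is dense when it holds at least `min_points` points.
    dense = {c for c, pts in cells.items() if len(pts) >= min_points}

    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        # Flood-fill over the 8-neighbourhood of dense cells.
        stack, members = [start], []
        seen.add(start)
        while stack:
            cx, cy = stack.pop()
            members.extend(cells[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
        clusters.append(sorted(members))
    return clusters


# Illustrative data: two dense groups and one isolated point.
points = [(0.1, 0.2), (0.4, 0.9), (1.2, 0.5), (1.7, 0.3),
          (8.0, 8.1), (8.5, 8.6), (4.0, 4.0)]
clusters = grid_clusters(points)
```

Points in sparse cells, such as the isolated point above, are left out of every cluster and can be treated as noise.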
In many situations it is desirable to explore spatial
associations [19][11] to discover rules which associate
one or more spatial objects with other spatial objects.
There are various kinds of spatial predicates that could
constitute a spatial association rule. Examples include
topological relations like intersects, overlap, disjoint;
spatial orientations like left_of, west_of; and distance
information such as close_to, or far_away.
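
The three kinds of predicates can be sketched for a simplified case in which each spatial object is reduced to an axis-aligned bounding box. The `Box` class, the `relations` helper, and the distance threshold of 2.0 are assumptions of this example; real GIS geometries and predicate definitions are considerably richer.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box standing in for a spatial object."""
    name: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a, b):
    """Topological: the boxes share at least one point."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def west_of(a, b):
    """Orientation: a lies entirely to the west of b (x grows eastward)."""
    return a.xmax < b.xmin


def distance(a, b):
    """Gap between the boxes; 0 when they touch or overlap."""
    dx = max(b.xmin - a.xmax, a.xmin - b.xmax, 0.0)
    dy = max(b.ymin - a.ymax, a.ymin - b.ymax, 0.0)
    return hypot(dx, dy)


def relations(a, b, threshold=2.0):
    """List the predicates that hold from a to b."""
    found = ["intersects" if intersects(a, b) else "disjoint"]
    if west_of(a, b):
        found.append("west_of")
    found.append("close_to" if distance(a, b) <= threshold else "far_away")
    return found


# Illustrative objects.
school = Box("school", 0, 0, 2, 2)
park = Box("park", 1, 1, 4, 4)
river = Box("river", 10, 0, 12, 8)
```

With these objects, `relations(school, park)` yields `["intersects", "close_to"]` and `relations(school, river)` yields `["disjoint", "west_of", "far_away"]`; predicate lists of this kind are the raw material from which spatial association rules are formed.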
Approximation and aggregation [17]. Clustering
approaches answer the question of where the clusters in a
spatial database are located. A separate problem is to find
out why the clusters are there. This question can be
rephrased in terms of the characteristics of the clusters,
described by the objects that are close to them: we need
to analyze both the objects in a cluster and the objects
close to it.
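
One simple way to characterize a cluster by the objects close to it is to count, per feature type, the features that lie within a given radius of any cluster member. The sketch below follows that idea only loosely and is not the method of [17]; the function name `nearby_feature_types`, the feature layout, and the radius are assumptions of this example.

```python
from collections import Counter
from math import dist


def nearby_feature_types(cluster, features, radius):
    """Count feature types lying within `radius` of any point in the cluster."""
    counts = Counter()
    for kind, location in features:
        if any(dist(p, location) <= radius for p in cluster):
            counts[kind] += 1
    return counts


# Illustrative data: one cluster and a few typed features, as (type, (x, y)).
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
features = [
    ("school", (0.5, 0.5)),
    ("school", (1.5, 0.0)),
    ("factory", (9.0, 9.0)),
    ("park", (0.0, 2.0)),
]
profile = nearby_feature_types(cluster, features, radius=1.0)
```

The resulting counts give an aggregate description of the cluster's surroundings, for example that it lies mostly near schools.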
Finally, three other methods for discovering
knowledge in datasets are:
- Mining an image database [12][11] can be viewed as
  another approach to spatial data mining.
- Classification learning [20] is the task of assigning an
  object to a class from a given set of classes based on
  the attribute values of the object.
- Spatial trend detection [9] can describe a regular
  change of one or more non-spatial attributes of an
  object that changes its position in time.
The remainder of this paper is organized as follows:
Sections 2 and 3 present basic topics about spatial and
Proceedings of the Fourth Mexican International Conference on Computer Science (ENC’03)
0-7695-1915-6/03 $17.00 © 2003 IEEE