Journal of Uncertain Systems
Vol.3, No.4, pp.298-306, 2009
Online at: www.jus.org.uk
Implementation of the Extended Fuzzy C-Means Algorithm in
Geographic Information Systems
Ferdinando Di Martino (1,2,∗), Salvatore Sessa (1)
(1) Università degli Studi di Napoli Federico II, Dipartimento di Costruzioni e
Metodi Matematici in Architettura, Via Monteoliveto 3, 80134 Napoli, Italy
(2) Università degli Studi di Salerno, Dipartimento di Matematica e Informatica,
Via Ponte Don Melillo, 84084 Fisciano, Italy
Received 28 March 2009; Revised 23 June 2009
Abstract
Density-based clustering methods are used in spatial analysis for the determination of impact areas, but they have
high computational complexity. We propose the extended fuzzy c-means (EFCM) algorithm as an alternative method
because it has three advantages: robustness to noise and outliers, linear computational complexity, and automatic
determination of the optimal number of clusters. We implement the EFCM algorithm inside a geographic information
system (GIS) for the determination of buffer areas as hypersphere volume prototypes, which are circles in the case of
two-dimensional pattern data. We have applied this algorithm to the spatial analysis of buffer areas, called hotspots,
that include fire point-events of the Santa Fe district (NM), downloaded from http://www.fs.fed.us/r3/gis/sfe_gis.shtml.
© 2009 World Academic Press, UK. All rights reserved.
Keywords: extended fuzzy c-means, fuzzy c-means, GIS, hotspot
1 Introduction
In spatial analysis a buffer area is an area within a specified distance of the features of a theme. This area is determined
as a polygon by defining distance parameters that can be set as constants or as variables determined by feature attributes:
for instance, circular buffer areas are obtained around a feature of the theme by using the radius of the circle as the
distance parameter. Buffer areas are calculated in many fields of spatial analysis, where they can delimit
hazardous bounded zones: for example, areas around the epicenter of an earthquake, areas of industrial pollution, and
urban areas where the construction of buildings is forbidden by local legislation.
The primitive buffering operations in a geographic information system (GIS) concern points, lines and
polygons. In spatial analysis, an area with the dimensions of a continent can be considered, to a good approximation,
as a plane, so Euclidean geometry is applied in the calculation of distances. For this reason the buffer area around a
point on the map is a circle (a "circular polygon" in spatial analysis terms) centered at that point. For
instance, the epicenter of an earthquake or the location of a criminal event can be represented by a point. The radius
of this circle is called the buffer distance, which the user sets either as a constant value for all the point data
or as the value of a field in the point data table. Moreover, the user has two options: to keep these circular buffer
areas separate (cf. Fig.1) or to merge some of them, obtaining new polygonal areas (cf. Fig.2).
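The two buffering options for point data can be illustrated with a minimal Python sketch. It is not the GIS implementation used in this paper: the names PointEvent, circular_buffer, overlaps and merge_groups are hypothetical, the coordinates are assumed to be planar (Euclidean), and the "merge" option is approximated by grouping the circles that overlap, without computing the geometry of the merged polygon as a GIS dissolve operation would.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Circle = Tuple[float, float, float]  # (center x, center y, radius)


@dataclass
class PointEvent:
    """Hypothetical point feature; buffer_distance mimics a table field."""
    x: float
    y: float
    buffer_distance: Optional[float] = None


def circular_buffer(event: PointEvent, default_radius: float) -> Circle:
    """Use the per-point field value if present, else the constant radius."""
    r = event.buffer_distance
    if r is None:
        r = default_radius
    return (event.x, event.y, r)


def overlaps(a: Circle, b: Circle) -> bool:
    """Two circles overlap when the distance between their centers
    does not exceed the sum of their radii."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) <= a[2] + b[2]


def merge_groups(buffers: List[Circle]) -> List[List[int]]:
    """Group the indices of buffers that overlap, directly or through a
    chain of overlapping buffers, using a simple union-find structure."""
    parent = list(range(len(buffers)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(buffers)):
        for j in range(i + 1, len(buffers)):
            if overlaps(buffers[i], buffers[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(buffers)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Illustrative data only: three events, the last one with its own radius.
events = [PointEvent(0.0, 0.0), PointEvent(1.5, 0.0), PointEvent(10.0, 10.0, 2.0)]
buffers = [circular_buffer(e, default_radius=1.0) for e in events]
groups = merge_groups(buffers)
```

In this sketch each inner list of groups holds the indices of the buffers that would be merged into one polygonal area, while a single-element list corresponds to a buffer that stays a separate circle. The pairwise overlap test is quadratic in the number of points, which is acceptable only for small illustrative data sets.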
When the number of event-points is large, the classical density methods are not suitable for the determination
of impact areas because of their high computational complexity. The use of clustering algorithms then seems more
appropriate: it is well known that each cluster contains similar data, while the degree of association between data
of different clusters is weak. Clustering algorithms (e.g., [8, 9, 11, 12, 13, 14]) are useful for the determination of buffer areas,
called hotspots in crime analysis, car crash analysis, disease diffusion analysis, etc. For instance, the National Institute
of Justice in Washington DC (USA) has developed a statistical tool, CrimeSTAT [9], for the GIS analysis of crime
incident locations. We refer to [6] for an exhaustive list of clustering techniques which determine hotspots.
In order to determine the shape of each hotspot we have to use a density estimation method ([8, 13]), whereas the
fuzzy c-means (FCM) algorithm [3] uses point-like cluster prototypes. However, in many cases it is not necessary to
∗ Corresponding author. Email: fdimarti@unina.it (F. Di Martino).