Soft Comput (2014) 18:2377–2384
DOI 10.1007/s00500-013-1211-7
METHODOLOGIES AND APPLICATION
Spatio-temporal hotspots and application on a disease analysis
case via GIS
Ferdinando Di Martino · Salvatore Sessa ·
Umberto E. S. Barillari · Maria Rosaria Barillari
Published online: 3 January 2014
© Springer-Verlag Berlin Heidelberg 2013
Abstract Hotspot analysis is a form of spatial analysis that
uses clustering techniques to determine areas with elevated
concentrations of localized events. We use the well-established
Extended Fuzzy C-Means (EFCM) algorithm to determine the
hotspot areas on the map as circles. The advantages of this
technique are its linear computational complexity, its
robustness to noise and outliers, and the automatic
determination of the optimal number C of clusters (in the
classical FCM algorithm, C is chosen a priori). Furthermore,
it prevents clusters in low-density areas of data points from
shifting toward areas with a higher density of such points.
We apply this method to study the spatio-temporal variations
of the hotspot areas, testing the process on a specific disease
problem: we have clustered 5,000 point-events corresponding
to cases of brain cancer detected in the state of New Mexico
from 1973 to 1991. We also show that the same results are
obtained by using the Extended Gustafson–Kessel (EGK)
algorithm, which gives elliptical clusters. We have
implemented both algorithms in a Geographic Information
System (GIS) environment. Thus we establish the areas that
seem not to be affected by the incidence of the disease and
those areas in which the phenomenon appears, over time, to
have attenuated, increased, remained constant, or almost
disappeared.
Keywords Brain cancer · EFCM · EGK · GIS · Hotspot
Communicated by G. Acampora.
F. Di Martino · S. Sessa (✉)
Dipartimento di Architettura, Università degli Studi di Napoli
Federico II, Via Monteoliveto 3, 80134 Naples, Italy
e-mail: sessa@unina.it
U. E. S. Barillari · M. R. Barillari
Dipartimento di Psichiatria, Neuropsichiatria Infantile, Audiofoniatria
e Dermatovenereologia, Seconda Università degli Studi di Napoli,
L.go Madonna delle Grazie, 80138 Naples, Italy
1 Introduction
The concept of a buffer area in a Geographic Information
System (GIS) is well known. Given an event spatially
geo-referenced as a point with geographical coordinates on
the map, a simple buffer area is given by a circle centered at
that point. For example, the epicenter of an earthquake is
represented by a point, which is the center of a circle
considered as the influence area of that event; its radius is
usually called the buffer distance. The buffer area is usually
called a hotspot (Chainey et al. 2002). Another typical example
of a hotspot is a circle centered at the geographical location
of a criminal event or a disease event. A complex hotspot can
be represented as the result of a merge operation on
intersecting circular buffer areas. Indeed, if we have a set
of geo-referenced event data, we can determine the geometrical
form of the resulting hotspot by merging the intersecting
circular hotspots around each point-event. But when we handle
a huge number of point-events, we cannot merge circular
buffer areas or use an interactive method to determine the
hotspot areas. Generally speaking, in this case we must use
a clustering algorithm in which the patterns are formed by the
point-events and the features are the geographical coordinates
of those points. The use of fuzzy clustering algorithms is
well known (see, e.g., Loia et al. 2004; Chertov and
Aleksandrova 2013), and it has recently proved successful in
studying medical datasets (Polat 2012; Wei et al. 2012;
Avogadri and Valentini 2009; Masulli and Schenone 1999;
Windischberger et al. 2003; Zhang and Chen 2004), while for
GIS in medicine we recall, for instance, Kobayashi et al.
(2010) and Mullner et al. (2004).
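To make the clustering setup concrete, the following is a minimal sketch of the classical FCM algorithm (not the extended EFCM variant used in this paper), in which each pattern is a point-event and its features are its two coordinates. The function names, the sample coordinates, and the initial centers are hypothetical, and the coordinates are assumed to be planar (for example, already projected), so that Euclidean distance is meaningful. Note that the number of clusters is fixed in advance by the initial centers.

```python
import math

def memberships(points, centers, m=2.0, eps=1e-12):
    """Return u[i][j]: fuzzy membership of point j in cluster i."""
    power = 2.0 / (m - 1.0)
    # Distances from every center to every point (eps avoids division by zero).
    d = [[math.hypot(p[0] - c[0], p[1] - c[1]) + eps for p in points]
         for c in centers]
    return [[1.0 / sum((d[i][j] / d[k][j]) ** power
                       for k in range(len(centers)))
             for j in range(len(points))]
            for i in range(len(centers))]

def fcm(points, centers, m=2.0, iterations=50):
    """Classical FCM: alternate membership and center updates."""
    for _ in range(iterations):
        u = memberships(points, centers, m)
        new_centers = []
        for row in u:
            # Each center is the mean of the points weighted by u^m.
            weights = [x ** m for x in row]
            total = sum(weights)
            new_centers.append((
                sum(w * p[0] for w, p in zip(weights, points)) / total,
                sum(w * p[1] for w, p in zip(weights, points)) / total,
            ))
        centers = new_centers
    return centers, memberships(points, centers, m)

# Hypothetical point-events: two well-separated groups of (x, y) coordinates.
events = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, u = fcm(events, centers=[(1.0, 1.0), (9.0, 9.0)])
```

Here `centers` holds one prototype per cluster and `u[i][j]` is the degree to which event `j` belongs to cluster `i`; for every event the memberships sum to 1 across clusters.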
The clustering algorithms used to determine the shape of
each hotspot are the density estimation methods (Gath and
Geva 1989; Krishnapuram and Kim 2002), which use a concept
of point density to estimate the form of the clusters, and the