Soft Comput (2014) 18:2377–2384
DOI 10.1007/s00500-013-1211-7

METHODOLOGIES AND APPLICATION

Spatio-temporal hotspots and application on a disease analysis case via GIS

Ferdinando Di Martino · Salvatore Sessa · Umberto E. S. Barillari · Maria Rosaria Barillari

Published online: 3 January 2014
© Springer-Verlag Berlin Heidelberg 2013

Abstract Hotspot analysis is a spatial analysis technique that uses clustering to determine areas with elevated concentrations of localized events. We use the well-established Extended Fuzzy C-Means (EFCM) algorithm to determine the hotspot areas on the map as circles. The advantages of this technique are its linear computational complexity, its robustness to noise and outliers, and the automatic determination of the optimal number C of clusters (in the classical FCM algorithm, C is chosen a priori). Furthermore, it prevents clusters located in areas with a low density of data points from shifting toward areas with a higher density of such points. We apply this method to study the spatio-temporal variations of the hotspot areas, testing the process on a specific disease problem: we cluster 5,000 point-events corresponding to cases of brain cancer detected in the state of New Mexico from 1973 to 1991. We also show that the same results are obtained by using the Extended Gustafson–Kessel (EGK) algorithm, which gives elliptical clusters. We have implemented both algorithms in a Geographic Information System (GIS) environment. We thus identify the areas that seem unaffected by the disease and those areas in which the phenomenon appears, over time, to have attenuated, increased, remained constant, or almost disappeared.

Keywords Brain cancer · EFCM · EGK · GIS · Hotspot

Communicated by G. Acampora.

F. Di Martino · S. Sessa (✉)
Dipartimento di Architettura, Università degli Studi di Napoli Federico II, Via Monteoliveto 3, 80134 Naples, Italy
e-mail: sessa@unina.it

U. E. S. Barillari · M. R. Barillari
Dipartimento di Psichiatria, Neuropsichiatria Infantile, Audiofoniatria e Dermatovenereologia, Seconda Università degli Studi di Napoli, L.go Madonna delle Grazie, 80138 Naples, Italy

1 Introduction

The concept of buffer area in a Geographic Information System (GIS) is well known. Given an event geo-referenced as a point with geographical coordinates on the map, a simple buffer area is a circle centered at that point. For example, the epicenter of an earthquake is represented by a point, which is the center of a circle regarded as the influence area of that event; the radius of this circle is usually called the buffer distance. The buffer area is usually called a hotspot (Chainey et al. 2002). Another typical example of a hotspot is a circle centered at the geographical location of a criminal event or a disease event. A complex hotspot can be represented as the result of a merge operation on intersecting circular buffer areas. Indeed, given a set of geo-referenced event data, we can determine the geometrical form of the resulting hotspot by merging the intersecting circular hotspots around each point-event. However, when we handle a huge number of point-events, we can neither merge circular buffer areas nor use an interactive method to determine the hotspot areas. In this case we must use a clustering algorithm in which the patterns are the point-events and the features are the geographical coordinates of those points. Fuzzy clustering algorithms are widely used (see, e.g., Loia et al. 2004; Chertov and Aleksandrova 2013) and have recently been applied successfully to medical datasets (Polat 2012; Wei et al. 2012; Avogadri and Valentini 2009; Masulli and Schenone 1999; Windischberger et al. 2003; Zhang and Chen 2004), while for GIS in medicine we recall, for instance, Kobayashi et al. (2010) and Mullner et al. (2004).
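To make the clustering setup described above concrete (point-events as patterns, coordinates as features), the following is a minimal sketch of the classical FCM algorithm, in which the number C of clusters is fixed a priori. It is not the EFCM or EGK algorithm used in this paper. The function name `fcm`, its parameters (`m`, `max_iter`, `tol`) and the toy coordinates are illustrative assumptions, and the points are assumed to lie in a planar (projected) coordinate system so that Euclidean distance is meaningful.

```python
import math

def fcm(points, centers, m=2.0, max_iter=100, tol=1e-6):
    """Classical fuzzy C-means on 2-D points (illustrative sketch, not EFCM).

    points  : list of (x, y) event coordinates (assumed planar)
    centers : initial cluster centers; C = len(centers) is fixed a priori
    m       : fuzzifier, must be > 1
    Returns the final centers and the membership matrix u, where u[i][j] is
    the membership of point j in cluster i, computed in the last iteration
    before the final center update.
    """
    centers = [tuple(c) for c in centers]
    exponent = 2.0 / (m - 1.0)
    u = []
    for _ in range(max_iter):
        # Membership update from the current centers.
        u = [[0.0] * len(points) for _ in centers]
        for j, p in enumerate(points):
            dists = [math.hypot(p[0] - c[0], p[1] - c[1]) for c in centers]
            zeros = [i for i, d in enumerate(dists) if d == 0.0]
            if zeros:
                # The point coincides with one or more centers: split the
                # whole membership equally among those centers.
                for i in zeros:
                    u[i][j] = 1.0 / len(zeros)
            else:
                for i, d_i in enumerate(dists):
                    u[i][j] = 1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        # Center update: mean of the points weighted by u^m.
        new_centers = []
        for i, old in enumerate(centers):
            weights = [u[i][j] ** m for j in range(len(points))]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(old)  # degenerate cluster: keep old center
                continue
            new_centers.append((
                sum(w * p[0] for w, p in zip(weights, points)) / total,
                sum(w * p[1] for w, p in zip(weights, points)) / total,
            ))
        shift = max(math.hypot(a[0] - b[0], a[1] - b[1])
                    for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u

# Toy data: two well-separated groups of hypothetical point-events.
events = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
final_centers, memberships = fcm(events, [(0.0, 0.0), (11.0, 11.0)])
```

In this sketch each point-event receives a membership degree in every cluster, and the memberships of a point sum to 1. Turning a cluster into a circular hotspot on the map additionally requires a radius, which classical FCM does not provide.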
The clustering algorithms used to determine the shape of each hotspot are the density estimation methods (Gath and Geva 1989; Krishnapuram and Kim 2002), which use a concept of point density to estimate the form of the clusters, and the