Open Journal of Safety Science and Technology, 2012, 2, 98-107
http://dx.doi.org/10.4236/ojsst.2012.23013 Published Online September 2012 (http://www.SciRP.org/journal/ojsst)

Outbreak Detection of Spatio-Temporally Smoothed Crashes

Ross Sparks, Chris Okugami, Sarah Bolt
CSIRO Mathematics, Informatics and Statistics, Sydney, Australia
Email: Ross.Sparks@csiro.au

Received May 1, 2012; revised June 10, 2012; accepted June 24, 2012

ABSTRACT

Spatio-temporal surveillance methods for detecting outbreaks are common, with the SCAN statistic setting the benchmark. If the shape and size of the outbreaks are known, then the SCAN statistic can be trained to detect them efficiently; however, this is seldom the case. Devising a plan that is efficient at detecting a range of outbreaks that vary in size and shape is therefore important in practical applications. This paper introduces a method called EWMA Surveillance Trees that uses a binary recursive partitioning approach to locate and detect outbreaks. The approach is explained and its performance is then compared to that of the SCAN statistic in a series of simulation studies. While the SCAN statistic is shown to remain the most effective at detecting outbreaks of a known shape and size, the EWMA Surveillance Trees are shown to be more robust. The method is also applied to actual data on motor vehicle crashes in an area of Sydney, Australia, from 2000 to 2004 in order to detect dates and geographic regions with outbreaks of crashes above the expected level.

Keywords: Average Run Length; Exponential Weighted Moving Averages; Monitoring; Spatial Outbreaks; Spatio-Temporal Smoothing; Crash Outbreaks

1. Introduction

The SCAN statistic [1] has been successful at prospectively detecting space-time clusters.
Kulldorff [2-4] has developed SCAN plans and implemented them in the SaTScan software package for a variety of problems, including Bernoulli data, Poisson counts and a space-time permutation model using only case data, amongst others. However, there are some important limitations to this approach, which are addressed in this paper. Firstly, the space-time permutation model compares incidences to what is expected under the assumption that all cases are independent of each other. That is, the expected values are determined under the assumption that there is no space-time interaction. Secondly, the spatio-temporal SCAN statistic has been criticised by Woodall et al. [5] and Han et al. [6] for not being as efficient as the CUSUM [7,8] for outbreak detection. Lastly, the ability to detect outbreaks most effectively depends on the choice of shape and size of the scanning window.

However, the attraction of the SCAN technology is that it is easy to understand, and therefore people use it. For this reason, the SCAN statistic is implemented in this paper as a benchmark for comparison. In our implementation, we considered the two-dimensional scan statistic used for detecting spatial clusters, as discussed in detail in the book by Glaz et al. [9]. To extend the method to the detection of three-dimensional spatio-temporal clusters, we use the lattice structure outlined in Glaz et al. [9] and then search over this structure for groups of rectangular blocks of space and time in order to alarm for unusually high counts. The counts within the rectangular blocks of space-time are compared to their respective expected counts to measure how unusual they are. Boundaries of all significant geographical regions are outlined on a map to indicate the geography of the outbreak.

The EWMA Surveillance Tree plan that is proposed in this paper addresses all of the concerns raised above.
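The benchmark search over rectangular space-time blocks described in this section can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `block_sum` and `scan_blocks`, the cap `max_len` on block extent, the standardised excess (observed minus expected, divided by the square root of expected) used as the measure of unusualness, and the alarm limit `threshold` are all assumptions made for illustration. The sketch enumerates every rectangular block on a small lattice by brute force and reports the blocks whose counts exceed expectation by more than the assumed limit.

```python
from itertools import product
from math import sqrt


def block_sum(grid, t0, t1, x0, x1, y0, y1):
    """Sum grid[t][x][y] over the half-open block [t0,t1) x [x0,x1) x [y0,y1)."""
    return sum(
        grid[t][x][y]
        for t in range(t0, t1)
        for x in range(x0, x1)
        for y in range(y0, y1)
    )


def scan_blocks(counts, expected, max_len=(2, 2, 2), threshold=3.0):
    """Return (score, block) pairs, highest score first, for alarming blocks.

    counts, expected: nested lists indexed [time][row][col] on a fixed lattice.
    max_len: hypothetical cap on block extent along (time, row, col).
    threshold: hypothetical alarm limit on the standardised excess.
    """
    dims = (len(counts), len(counts[0]), len(counts[0][0]))
    # For each axis, list every (start, stop) interval of width 1..max_len.
    intervals = [
        [(s, s + w) for s in range(n) for w in range(1, m + 1) if s + w <= n]
        for n, m in zip(dims, max_len)
    ]
    alarms = []
    for (t0, t1), (x0, x1), (y0, y1) in product(*intervals):
        obs = block_sum(counts, t0, t1, x0, x1, y0, y1)
        exp = block_sum(expected, t0, t1, x0, x1, y0, y1)
        if exp <= 0:
            continue  # no baseline to compare against
        score = (obs - exp) / sqrt(exp)  # assumed Poisson-style standardisation
        if score > threshold:
            alarms.append((score, ((t0, t1), (x0, x1), (y0, y1))))
    return sorted(alarms, reverse=True)


# Made-up example: 3 time steps on a 4 x 4 lattice, expected count 1 per cell,
# with an artificial outbreak injected into a 2 x 2 region at the last step.
expected = [[[1.0] * 4 for _ in range(4)] for _ in range(3)]
counts = [[[1] * 4 for _ in range(4)] for _ in range(3)]
for x in (1, 2):
    for y in (1, 2):
        counts[2][x][y] = 6

top_score, top_block = scan_blocks(counts, expected)[0]
print(top_score, top_block)
```

In this made-up example the highest-scoring block is the injected 2 x 2 region at the final time step. A fixed limit applied to every block ignores the multiplicity of overlapping blocks, so a practical plan would need to calibrate the limit, for example by simulation. The sketch covers only the benchmark search and not the EWMA Surveillance Tree plan.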
This plan also makes use of the fixed lattice structure, since this structure is well suited to the application of Exponentially Weighted Moving Average (EWMA) temporal smoothing of the counts. This smoothing improves early detection over the moving average approach suggested by Kulldorff and others. The EWMA smoothing therefore avoids the criticisms made by Woodall et al. [5] and Han et al. [6]. Also, in this paper, we compare incidences to historical expected values, where the expected values can be space-time dependent. Clustered outbreaks are therefore signaled in this paper when the counts are higher than expected in a random local region. Lastly, by doing away with the scanning window altogether we have removed the need for this parameterisa- Copyright © 2012 SciRes. OJSST