Accident Analysis and Prevention 40 (2008) 1634–1635
Journal homepage: www.elsevier.com/locate/aap

Short communication

How many accidents are needed to show a difference?

Ezra Hauer
Department of Civil Engineering, University of Toronto, 35 Merton Street, Apartment 1706, Toronto, ON, M4S 3G4, Canada

Article history: Received 22 November 2007; received in revised form 5 March 2008; accepted 24 March 2008

Keywords: Safety; Statistics; Study design

Abstract

When a road safety study is contemplated, one has to establish how many accidents are needed to reach conclusions with a given level of confidence. Later, when the results are in, one has to be explicit about the confidence with which conclusions are stated. The purpose of this note is to describe a back-of-the-envelope way of answering such questions with a precision that is sufficient for practical purposes.

© 2008 Elsevier Ltd. All rights reserved.

There are time-honored ways of testing statistical hypotheses and of designing studies based upon the anticipation of such tests. These require the use of software or tables, rely on assumptions that are not easy to justify, and produce statements whose meaning is difficult to communicate simply. The upshot is that null hypothesis tests of significance are often misapplied and misinterpreted (Hauer, 2004).

An alternative approach is described here. The results of a study are estimates of differences in accident rates or frequencies. Each such estimate has a standard error. The more standard errors separate the estimated difference from 0, the less likely it is that the true difference lies on the opposite side of 0. The merit of the suggested approach is in its simplicity and clarity of communication.

Let $x_1$ and $x_2$ be accident counts for $c_1$ and $c_2$ years or kilometer-years.
The subscripts 1 and 2 may designate two sets of units, one ‘without’ and the other ‘with’ some feature, or perhaps the same set of units ‘before’ and ‘after’ some change. Let $\mu_1$ and $\mu_2$ be the two unknown expected accident counts per year or per kilometer-year. The questions are: (1) given $x_1$ and $x_2$, how confident can one be that $\mu_1 - \mu_2 > 0$; or (2) how large must the accident counts $x_1$ and $x_2$ be for us to be confident that $\mu_1 - \mu_2 > 0$?

The purpose here is to describe a back-of-the-envelope way of answering these questions with a precision that is sufficient for practical purposes, and that relies on concepts that are close enough to intuition to allow the use of ordinary language. The assumption will be that accident counts are Poisson distributed, that $x_1$ and $x_2$ are statistically independent, and that the difference $\mu_1 - \mu_2$ is estimated by $(x_1/c_1) - (x_2/c_2)$. From here it follows that the variance of the estimated difference is $(\mu_1/c_1) + (\mu_2/c_2)$ and is estimated by $x_1/c_1^2 + x_2/c_2^2$. The subscripts 1 and 2 are chosen so that $(x_1/c_1) > (x_2/c_2)$.

Tel.: +1 416 483 4452. E-mail address: Ezra.Hauer@utoronto.ca.

Let ‘k’ be the distance between the estimate and 0 as measured in standard errors of the estimate. With this,

$$\frac{(x_1/c_1) - (x_2/c_2)}{\sqrt{(x_1/c_1^2) + (x_2/c_2^2)}} = k \qquad (1)$$

or, multiplying numerator and denominator by $c_1$,

$$\frac{x_1 - (c_1/c_2)\,x_2}{\sqrt{x_1 + (c_1/c_2)^2\,x_2}} = k \qquad (2)$$

There is an approximate rule of thumb (based on the normal distribution) saying that the true value is within ± one standard error of its estimate with a 65% chance, within ± two standard errors with a 95% chance, and within ± three standard errors with a 99.9% chance. In the circumstances of interest the means $\mu_1$ and $\mu_2$ are large enough that the Poisson is well approximated by the normal distribution and the rule of thumb is good enough. After all, it hardly matters whether the chance is really 65% or 69% or any similar number.
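
As a rough illustration of Eqs. (1) and (2), the short Python sketch below computes k both ways and shows that the two forms agree. The function names and the counts used are hypothetical, chosen only for illustration; they are not taken from this note.

```python
import math


def k_from_rates(x1, c1, x2, c2):
    """Eq. (1): estimated difference in rates divided by its estimated standard error."""
    difference = x1 / c1 - x2 / c2
    standard_error = math.sqrt(x1 / c1**2 + x2 / c2**2)
    return difference / standard_error


def k_from_counts(x1, c1, x2, c2):
    """Eq. (2): the same ratio after multiplying numerator and denominator by c1."""
    ratio = c1 / c2
    return (x1 - ratio * x2) / math.sqrt(x1 + ratio**2 * x2)


# Hypothetical counts (an assumption for illustration only):
# 100 accidents in 2 years 'before' and 36 accidents in 1 year 'after'.
# The subscripts are assigned so that x1/c1 > x2/c2, as the text requires.
k1 = k_from_rates(100, 2, 36, 1)
k2 = k_from_counts(100, 2, 36, 1)
print(f"k from Eq. (1): {k1:.3f}")
print(f"k from Eq. (2): {k2:.3f}")
```

With these made-up counts the estimated difference is 14 accidents per year and k comes out at roughly 1.8, so the estimate lies between one and two standard errors from 0.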
The verbal equivalents to the 65%, 95% and 99.9% values might be that one is “somewhat confident”, “confident” or “virtually certain” that the true value is within one, two or three standard errors of the estimate. Thus, when $k > 1$ one can be more than “somewhat confident”, when $k > 2$ one can be more than “confident”, and when $k \geq 3$ “virtually certain” that $\mu_1 - \mu_2$ is not 0 or less.

0001-4575/$ – see front matter. doi:10.1016/j.aap.2008.03.013

Numerical example 1: How confident can one be that there was an improvement?