Transportation Research Record 1840, p. 123. Paper No. 03-2241

In Belgium, traffic safety is one of the government’s highest priorities. The identification and profiling of black spots and black zones (geographical locations with high concentrations of traffic accidents) in terms of accident-related data and location characteristics must provide new insights into the complexity and causes of road accidents, which, in turn, provide valuable input for governmental actions. Association rules were used to identify accident-related circumstances that frequently occur together at high-frequency accident locations. Furthermore, these patterns were analyzed and compared with frequently occurring accident-related characteristics at low-frequency accident locations. The strength of this approach lies with the identification of relevant variables that make a strong contribution toward obtaining a better understanding of accident circumstances and the discerning of descriptive accident patterns from more discriminating accident circumstances to profile black spots and black zones. This data-mining algorithm is particularly useful in the context of large data sets for road accidents, since data mining can be described as the extraction of information from large amounts of data. The results showed that human and behavioral aspects are of great importance in the analysis of frequently occurring accident patterns. These factors play an important role in identifying traffic safety problems in general. However, the accident characteristics that were the most discriminating between high-frequency and low-frequency accident locations are mainly related to infrastructure and location.

In Belgium, almost 70,000 people are victims of the approximately 50,000 injury-causing accidents that occur in traffic every year, and 1,500 of these victims die.
In 1998 the probability of having a deadly accident (relative to the number of vehicle kilometers traveled) in Belgium was almost 35% higher than the European average. On the basis of these figures, Belgium has a bad traffic safety record in comparison with those of most other European countries (1). Not only does the steady increase in traffic intensity pose a heavy burden on society in terms of the number of casualties, but the insecurity on the roads also has an important effect on the economic costs associated with traffic accidents. Accordingly, traffic safety is one of the highest priorities of the Belgian government.

For the past few decades, traffic accident data have been registered and analyzed to support traffic safety policies. The identification of geographical locations with high concentrations of traffic accidents (black spots and black zones) and the profiling of those locations in terms of accident-related data and location characteristics must therefore provide new insights into the complexity of factors and the criteria that play a significant role in the occurrence of traffic accidents to provide valuable input for governmental actions toward traffic safety.

According to Kononov and Janson (2), it is not possible to develop effective countermeasures without being able to properly and systematically relate accident frequency and severity to a large number of variables, such as roadway geometries, traffic control devices, roadside features, roadway conditions, driver behavior, and vehicle type. Lee et al. (3) indicate that statistical models have been widely used in the past to analyze road crashes to explain the relationship between crash involvement and traffic, geometric, and environmental factors.
However, Chen and Jovanis (4) demonstrated certain problems that may arise when classical statistical analysis is used with high-dimensional data sets, such as an exponential increase in the number of parameters as the number of variables increases, and the invalidity of statistical tests because of sparse data in large contingency tables. This is a situation in which data mining comes into play. Data mining can be defined as the nontrivial extraction of implicit, previously unknown, and potentially useful information from large amounts of data (5). Data-mining methods can therefore be particularly useful in the context of large data sets containing data on road accidents.

In the study described in this paper, a comparative analysis between high-frequency and low-frequency accident locations was conducted to determine which accident characteristics discriminate black spots and black zones from other locations. In particular, the data-mining technique of association rules was used to obtain a descriptive analysis of the accident data. In contrast to predictive models, the strength of this algorithm lies in the identification of relevant variables that make a strong contribution toward a better understanding of the circumstances in which accidents have occurred. The emphasis therefore lies on the interpretation of the results, which will be very important for improving traffic policies and ensuring traffic safety on roads.

The paper is organized as follows. First, a formal introduction to the technique of association rules is provided. This is followed by a description of the data set. Next, the results of the empirical study are presented. The paper is completed with a summary of the conclusions and directions for future research.

ASSOCIATION RULES

The association rules technique is a data-mining technique that can be used to efficiently search for interesting information in large amounts of data.
More specifically, the association algorithm produces a set of rules describing underlying patterns in the data by means of the support parameter and the confidence parameter. Informally, the support of an association rule indicates how frequently that rule occurs in the data. The higher the level of support of the rule, the more prevalent the rule is. Confidence is a measure of the reliability of an association rule. The higher the level of confidence of the rule is, the more confident one can be that the rule really uncovers the underlying relationships in the data. It is obvious that investigators

Profiling of High-Frequency Accident Locations by Use of Association Rules
Karolien Geurts, Geert Wets, Tom Brijs, and Koen Vanhoof
Data Analysis & Modeling Group, Faculty of Applied Economics, Limburg University, University Campus, 3590 Diepenbeek, Belgium.
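The support and confidence measures introduced informally in the ASSOCIATION RULES section can be illustrated with a minimal sketch. It assumes the usual textbook definitions, which the text above only paraphrases: support is the fraction of records containing all items of a rule, and confidence is the support of the whole rule divided by the support of its antecedent. The accident records and the attribute names (`road=wet`, `type=rear_end`, and so on) are hypothetical and are not taken from the paper's data set.

```python
from typing import FrozenSet, Sequence

# Hypothetical toy accident records (not from the paper's data):
# each record is a set of attribute=value items.
accidents = [
    frozenset({"road=wet", "light=dark", "type=rear_end"}),
    frozenset({"road=wet", "light=dark", "type=side"}),
    frozenset({"road=dry", "light=day", "type=rear_end"}),
    frozenset({"road=wet", "light=day", "type=rear_end"}),
]


def support(itemset: FrozenSet[str], records: Sequence[FrozenSet[str]]) -> float:
    """Fraction of records that contain every item in `itemset`."""
    if not records:
        return 0.0
    return sum(1 for r in records if itemset <= r) / len(records)


def confidence(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    records: Sequence[FrozenSet[str]],
) -> float:
    """Support of the full rule divided by the support of its antecedent."""
    antecedent_support = support(antecedent, records)
    if antecedent_support == 0:
        return 0.0
    return support(antecedent | consequent, records) / antecedent_support


# Assumed example rule: road=wet -> type=rear_end
antecedent = frozenset({"road=wet"})
consequent = frozenset({"type=rear_end"})

rule_support = support(antecedent | consequent, accidents)
rule_confidence = confidence(antecedent, consequent, accidents)

print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
# prints: support=0.50, confidence=0.67
```

In this toy example, two of the four records contain both items, so the rule's support is 0.50. Three records contain `road=wet`, and two of those are rear-end accidents, so the confidence is about 0.67.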