(8–10). However, the scope of this research was not the development of predictive models; instead, the goal for the study was the exploratory analysis of pedestrian crashes in Italy to (a) detect interdependence among crash patterns, as well as dissimilarities among patterns; (b) find nontrivial and unsuspected relations in the data; and (c) provide insight for the development of safety improvement strategies focused on pedestrians.

To achieve these objectives, data-mining techniques, which have been receiving increased attention from road safety researchers, were used to analyze the crash data (11). Data mining was used because of the huge quantity of information handled in the study and because of the nature of the crash phenomenon. A crash can be defined as a rare, random, multifactor event always preceded by a situation in which one or more road users fail to cope with the road environment (12). Each crash is the result of a chain of events that is, in its entirety, unique, but some factors are common to several crash circumstances. The identification of these factors and their interdependences by means of data-mining techniques can provide insight useful for the development of effective countermeasures.

Two complementary techniques were used: classification trees and association rules discovery. The classification and regression tree method, a nonparametric model that assumes no predefined underlying relationship between the target (dependent) variable and the predictors (independent variables), has been used in such applications as marketing (13), data editing (14), missing data imputation and data fusion (15), and road safety analyses (16–18). When the target variable is discrete, a classification tree is developed, whereas a regression tree is developed for a continuous target variable. Because this study aims to explore categorical variables, classification trees were developed.
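As a rough illustration of how a classification tree handles a discrete target, the sketch below searches for the single binary split that most reduces node impurity. It is a minimal sketch under stated assumptions, not the procedure used in the study: the Gini impurity criterion, the function names `gini` and `best_split`, and the toy records in `CRASHES` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy records (NOT data from the study): each crash has
# categorical predictors and a discrete target, "severity".
CRASHES = [
    {"area": "rural", "lighting": "night", "severity": "fatal"},
    {"area": "rural", "lighting": "night", "severity": "fatal"},
    {"area": "rural", "lighting": "day", "severity": "fatal"},
    {"area": "urban", "lighting": "night", "severity": "injury"},
    {"area": "urban", "lighting": "day", "severity": "injury"},
    {"area": "urban", "lighting": "day", "severity": "injury"},
    {"area": "urban", "lighting": "night", "severity": "fatal"},
    {"area": "urban", "lighting": "day", "severity": "injury"},
]


def gini(rows, target="severity"):
    """Gini impurity of the target classes in rows (0.0 means a pure node)."""
    n = len(rows)
    if n == 0:
        return 0.0
    counts = Counter(r[target] for r in rows)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows, predictors, target="severity"):
    """Return (impurity decrease, predictor, category) for the best binary split.

    Each candidate split sends rows with predictor == category to one child
    node and all remaining rows to the other (an assumed, simplified rule).
    """
    parent = gini(rows, target)
    best = (0.0, None, None)
    for p in predictors:
        for cat in sorted({r[p] for r in rows}):
            left = [r for r in rows if r[p] == cat]
            right = [r for r in rows if r[p] != cat]
            if not left or not right:
                continue  # a split must produce two non-empty children
            weighted = (
                len(left) * gini(left, target) + len(right) * gini(right, target)
            ) / len(rows)
            gain = parent - weighted
            if gain > best[0]:
                best = (gain, p, cat)
    return best


gain, predictor, category = best_split(CRASHES, ["area", "lighting"])
print(predictor, category, round(gain, 3))
```

On these toy records the split on `area` reduces impurity more than the split on `lighting`, so it would become the root split; a full tree would repeat the search recursively within each child node until a stopping rule is met.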
The association rules discovery technique in data mining has been successfully used to uncover obscured patterns or rules in a variety of fields, including market basket analysis, product recommendation, and medical record analysis. Recently, Montella et al. (12) and Pande and Abdel-Aty (11) used association discovery to detect interdependence among crash characteristics. Association rules discovery is the identification of the sets of items (i.e., the crash patterns in the present study) that occur together in a given event (i.e., a crash in this study) more often than they would if they were independent.

METHODOLOGY

Classification Trees

Tree-based methods are nonlinear and nonparametric data-mining tools for supervised classification and regression problems. They do not require a priori probabilistic knowledge about the phenomena under study, and no assumptions are necessary. A tree is an oriented graph formed by a finite number of nodes departing from the root

Data-Mining Techniques for Exploratory Analysis of Pedestrian Crashes

Alfonso Montella, Massimo Aria, Antonio D’Ambrosio, and Filomena Mauriello

Exploratory analysis was made of data from pedestrian crashes to detect interdependence and dissimilarities between crash patterns and to provide insight for the development of safety improvement strategies focused on pedestrians. Data-mining techniques, such as classification trees and association rules, were used on data related to 56,014 pedestrian crashes that occurred in Italy from 2006 to 2008. Crash severity was the response variable most sensitive to crash patterns. The most influential crash patterns were road type, pedestrian age, lighting conditions, vehicle type, and interactions between these patterns.
Notable results included associations between fatal crashes and rural areas, urban provincial and national roads, pedestrians older than 75 years, nighttime conditions, pedestrians older than 65 years in nighttime crashes, drivers’ young age and male gender in nighttime crashes, and truck involvement. To mitigate the fatal crash patterns identified by the classification trees and association rules, several measures are suggested for implementation. Results of the study are consistent with results of previous studies that used other analytic techniques, such as probabilistic models of crash injury severity. The data-mining techniques used in the study were able to detect interdependencies among crash characteristics. The use of classification trees and association rules, however, must be seen not as an attempt to supplant other techniques, but as a complementary method that can be integrated into other safety analyses.

Crashes involving pedestrians and motor vehicles are a serious problem throughout the world. In Italy, 598 pedestrians were killed in motor vehicle crashes in 2008, and pedestrian fatalities represented 13% of motor vehicle-related deaths. In the European Union, the proportion of pedestrian fatalities was about 17%. The European Commission has proposed halving the overall number of road deaths in the European Union by 2020, by defining the protection of vulnerable road users as a specific objective of the road safety action program (1). In the United States, the 4,378 pedestrian road fatalities accounted for 12% of total fatalities in 2008. To address this problem, Goal 9 in the AASHTO Strategic Highway Safety Plan is to improve pedestrian safety (2). Devising effective countermeasures requires significant research on pedestrian crashes.
Most road safety research has focused on the development of safety performance functions (3–7), which relate the expected number of crashes to explanatory variables related to the traffic flow, the highway geometry, and the environment, and on the development of probabilistic models of crash injury severity

A. Montella and F. Mauriello, Department of Transportation Engineering, and M. Aria and A. D’Ambrosio, Department of Mathematics and Statistics, University of Naples Federico II, Via Claudio 21, 80125 Naples, Italy. Corresponding author: A. Montella, alfonso.montella@unina.it.

Transportation Research Record: Journal of the Transportation Research Board, No. 2237, Transportation Research Board of the National Academies, Washington, D.C., 2011, pp. 107–116. DOI: 10.3141/2237-12