Please cite this article in press as: Guo, F., Fang, Y., Individual driver risk assessment using naturalistic driving data. Accid. Anal. Prev. (2012),
http://dx.doi.org/10.1016/j.aap.2012.06.014
Accident Analysis and Prevention xxx (2012) xxx–xxx
Journal homepage: www.elsevier.com/locate/aap
Individual driver risk assessment using naturalistic driving data
Feng Guo a,∗, Youjia Fang b
a Department of Statistics, Virginia Tech Transportation Institute, Virginia Tech, 406A Hutcheson Hall, Blacksburg, VA 24061-0439, USA
b Department of Statistics, Virginia Tech, Blacksburg, VA 24061, USA
Article info
Article history:
Received 30 November 2011
Received in revised form 6 June 2012
Accepted 18 June 2012
Keywords:
Individual driver risk
Naturalistic Driving Study
NEO-5 Personality inventory
Critical incident
K-mean cluster
Abstract
Driving risk varies substantially among drivers. Identifying and predicting high-risk drivers will greatly
benefit the development of proactive driver education programs and safety countermeasures. The objective
of this study is twofold: (1) to identify factors associated with individual driver risk and (2) to predict
high-risk drivers using demographic, personality, and driving characteristic data. The 100-Car Naturalistic
Driving Study was used for methodology development and application. A negative binomial regression
model was adopted to identify significant risk factors. The results indicated that the driver’s age, personality,
and critical incident rate had significant impacts on crash and near-crash risk. For the second objective,
drivers were classified into three risk groups based on crash and near-crash rate using a K-means clustering
method. The cluster analysis identified approximately 6% of drivers as high-risk drivers, with an average
crash and near-crash (CNC) rate of 3.95 per 1000 miles traveled, 12% of drivers as moderate-risk drivers
(average CNC rate = 1.75), and 84% of drivers as low-risk drivers (average CNC rate = 0.39). Two logistic
models were developed to predict the high- and moderate-risk drivers. Both models showed high predictive
power, with area under the curve values of 0.938 and 0.930 for the receiver operating characteristic
curves. This study concluded that crash and near-crash risk for individual drivers is associated with the
critical incident rate and with demographic and personality characteristics. Furthermore, the critical
incident rate is an effective predictor for high-risk drivers.
© 2012 Elsevier Ltd. All rights reserved.
1. Introduction
The substantial variation in individual driving risk has been documented
in many studies (Deery and Fildes, 1999; Ulleberg, 2001;
Dingus et al., 2006). Identifying factors associated with individual
driving risk and predicting high-risk drivers will enable proper
driver-behavior intervention and safety countermeasures to reduce
the crash likelihood of high-risk groups and improve overall driving
safety.
Traffic safety research involves drivers, vehicles, and the driving
environment. There is an extensive literature on the safety impact
of transportation infrastructure and traffic characteristics, e.g.,
the impacts of intersection design features, pavement conditions,
weather, and traffic flow conditions (Hauer et al., 1988; Poch and
Mannering, 1996; Maze et al., 2006; Guo et al., 2010; Lord and
Mannering, 2010). Crash occurrence is the primary risk measure for
infrastructure-related safety impact evaluation, with Poisson and
negative binomial (NB) models being the state-of-practice analysis
tools. However, research on individual driver risk in the traffic
and human factors engineering fields is limited.
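For readers less familiar with these count models, one common textbook formulation of the NB model is sketched here. It is not taken from this article, and all symbols are illustrative assumptions: $Y_i$ is the crash count for unit $i$, $e_i$ is an exposure measure such as traffic volume or miles traveled, $\mathbf{x}_i$ is a covariate vector, $\boldsymbol{\beta}$ holds the regression coefficients, and $\alpha$ is the overdispersion parameter.

```latex
% Generic Poisson-gamma (NB2) formulation; all notation is assumed for illustration.
% Gamma is parameterized by (shape, rate), so the multiplier has mean 1 and variance alpha.
\begin{align}
Y_i \mid \varepsilon_i &\sim \text{Poisson}(\mu_i \varepsilon_i), &
\varepsilon_i &\sim \text{Gamma}\!\left(\tfrac{1}{\alpha}, \tfrac{1}{\alpha}\right), \\
\log \mu_i &= \log e_i + \mathbf{x}_i^{\top}\boldsymbol{\beta}, \\
\mathrm{E}[Y_i] &= \mu_i, &
\mathrm{Var}[Y_i] &= \mu_i + \alpha \mu_i^{2}.
\end{align}
```

Under this parameterization, letting $\alpha \to 0$ recovers the Poisson model, which is why the NB model is often preferred when crash counts are overdispersed relative to the Poisson assumption.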
∗ Corresponding author. Tel.: +1 540 231 1038; fax: +1 540 231 3863.
E-mail addresses: feng.guo@vt.edu (F. Guo), youjia@vt.edu (Y. Fang).
In contrast to traffic engineering, the insurance and actuarial science
industries have a long history of research on classifying drivers
according to risk level to facilitate underwriting and pricing.
Estimating the occurrence of claims based on the driver’s age and
other relevant variables has been standard practice in actuarial
research (Segovia-Gonzalez et al., 2009). For the insurance industry,
quantified individual risk is directly related to the risk classification
standards (Walters, 1981). However, insurance data are proprietary
and, in general, not available for public access.
Individual driver risk can be affected by many factors. Besides
demographic variables such as age and gender, driver personality,
commonly measured by the NEO five traits inventory or
Zuckerman’s Sensation Seeking Scale, also plays an important
role in individual driving risk (Costa and McCrae, 1992). Studies
have shown an association between personality characteristics
and risky driving behavior (Jonah, 1997; Jonah et al., 2001; Ulleberg
and Rundmo, 2003; Dahlen and White, 2006; Machin and Sankey,
2008).
Driver behavior plays a central role in driver risk, but it is difficult
to measure in real-world driving situations. Recent developments
in vehicle instrumentation techniques, such as those used in the
Naturalistic Driving Study (NDS) (University of Michigan Transportation Research
Institute, 2005; Dingus et al., 2006; Guo and Hankey, 2009) and the
DriveCam system (Hickman et al., 2010), have made it both
technologically possible and economically feasible to monitor driving