Safety Science
journal homepage: www.elsevier.com/locate/safety
Derailment accident risk assessment based on ensemble classification method
Samira Kaeeni, Madjid Khalilian⁎, Javad Mohammadzadeh
Department of Computer Engineering, Karaj Branch, Islamic Azad University, Karaj, Iran
ARTICLE INFO
Keywords:
Data mining
Safety risk assessment
Ensemble classification
ABSTRACT
Safety plays an important role in the railway transportation industry. Planning and developing a safety system
requires sufficient awareness of the specific situations that create unsafe conditions in the railway network.
Derailment is known as one of the most critical types of train accident. Safety officials in this industry need to
draw on the experience of past accidents to prevent their recurrence. Up-to-date tools and techniques can offer a
view different from the one presented by railway safety officials. In this study, a classification model for
derailment accident risk assessment is proposed, which may be used in the safety system of a railway network.
The model uses the data accumulated in the Iranian Railway accident database. Three popular data mining
techniques are used in the proposed model in two steps. In the first step, Artificial Neural Networks, Naïve
Bayes, and Decision Tree are applied independently to predict the derailment accident risk, and each method
outputs its prediction in the form of probabilities. In the second step, the outcome of each model receives a
weight based on its prediction accuracy, determined by a genetic algorithm (GA), and the weighted outcomes yield
the final decision for derailment accident risk assessment. To validate its efficiency, the model was applied to a
sample from the Islamic Republic of Iran Railway. The results show that the model provides high-quality
information for predicting accidents and that the GA in the second step plays a significant role in improving
performance.
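The two-step scheme summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the fitness function (accuracy at a 0.5 threshold), the GA operators (elitist selection, one-point crossover, Gaussian mutation), the population settings, and the toy probabilities are all assumptions chosen for illustration. The three probability columns stand in for the outputs of the neural network, Naïve Bayes, and decision tree models.

```python
import random

def combine(probs, weights):
    """Weighted average of the per-model derailment probabilities for one sample."""
    return sum(w * p for w, p in zip(weights, probs)) / sum(weights)

def accuracy(weights, samples, labels, threshold=0.5):
    """Fraction of samples whose weighted probability falls on the correct side of the threshold."""
    hits = 0
    for probs, label in zip(samples, labels):
        pred = 1 if combine(probs, weights) >= threshold else 0
        hits += pred == label
    return hits / len(labels)

def evolve_weights(samples, labels, n_models=3, pop_size=20, generations=30, seed=0):
    """Toy GA (assumed design): search for positive model weights that maximize accuracy."""
    rng = random.Random(seed)
    # Start from equal weights plus random candidates.
    pop = [[1.0] * n_models]
    pop += [[rng.random() + 1e-6 for _ in range(n_models)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda w: accuracy(w, samples, labels), reverse=True)
        parents = pop[: pop_size // 2]           # elitist selection: best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_models)     # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_models)          # Gaussian mutation of one weight
            child[i] = max(1e-6, child[i] + rng.gauss(0, 0.1))
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: accuracy(w, samples, labels))

# Hypothetical step-one outputs: (ANN, Naive Bayes, decision tree) probabilities per sample.
toy_samples = [(0.9, 0.4, 0.6), (0.2, 0.7, 0.4), (0.8, 0.3, 0.3),
               (0.1, 0.6, 0.7), (0.7, 0.6, 0.2), (0.3, 0.2, 0.9)]
toy_labels = [1, 0, 1, 0, 1, 0]

best = evolve_weights(toy_samples, toy_labels)
print("weights:", best, "accuracy:", accuracy(best, toy_samples, toy_labels))
```

Because the best half of each generation is carried over unchanged and the initial population contains the equal-weights vector, the returned weights can never score below plain averaging on the data used for fitness. In practice the weights would have to be tuned on held-out data to avoid overfitting.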
1. Introduction
Nowadays, railways are one of the best choices for transportation to
reduce pollution and avoid traffic congestion. In addition, they offer
many advantages, such as safety, economy, speed, and regularity of
service, for both passengers and commodities. As a result, passengers
and shipping organizations prefer railways to other transportation
modes. Therefore, to maintain this preference and achieve a reasonable
advantage, railway transportation administrators should work
responsively to raise the level of safety and reduce the issues that
cause accidents.
Developing a safety system requires an intelligent system for
predicting accidents based on previous accident data from the railway
network and the current condition of vehicles. This can be done using a
model for safety risk assessment. It is extremely important to analyze
previous data in order to extract a model based on the knowledge hidden
in huge amounts of data. There are many well-established data mining
techniques for data analysis, particularly for numeric data. Effective
analysis of data from a database helps model creation and supports
safety management strategies by estimating accident risk (Mirabadi and
Sharifian, 2010). For these purposes, data mining is used to extract
knowledge from data. Knowledge can be defined as interesting patterns,
but the term ‘interesting’ is ambiguous.
Based on the literature, interesting patterns extracted from data are
characterized as non-trivial, previously unknown, implicit, and
potentially useful. Knowledge extraction is the process of gathering
data from different sources (e.g. databases), data preprocessing
(cleaning, integration, transformation, etc.), statistical
summarization, knowledge discovery (data mining), and eventually using
the extracted knowledge for decision making. Data mining has two main
functionalities: descriptive and predictive data mining. In
classification and prediction, one or more models are constructed to
describe or distinguish classes or concepts for future prediction (Han
et al., 2011). For example, it is possible to classify railway
accidents into different groups based on their features. The main
objective of this study is to discover meaningful patterns and trends
in the derailment accident data of the Iranian Railways (RAI).
Derailment is the most important source of rail accidents in Iran.
Accident analysis is one of the most common applications of data mining
in transportation. Very little is known about the usefulness of applying
data mining in railway accident analysis, although there are numerous
applications of data mining in road accident analysis. One main reason
for that may be the limited number of accidents happening on railway
networks compared to those on roads. One significant category of
https://doi.org/10.1016/j.ssci.2017.11.006
Received 3 February 2017; Received in revised form 23 September 2017; Accepted 5 November 2017
⁎ Corresponding author.
E-mail address: khalilian@kiau.ac.ir (M. Khalilian).
0925-7535/ © 2017 Elsevier Ltd. All rights reserved.