Contents lists available at ScienceDirect

Safety Science

journal homepage: www.elsevier.com/locate/safety

An optimization-based decision tree approach for predicting slip-trip-fall accidents at work

Sobhan Sarkar a,*, Rahul Raj b, Sammangi Vinay c, J. Maiti a, Dilip Kumar Pratihar c

a Department of Industrial & Systems Engineering, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
b Department of Electrical Engineering, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
c Department of Mechanical Engineering, Indian Institute of Technology Kharagpur, Kharagpur 721302, India

ARTICLE INFO

Keywords:
Machine learning
Accident prediction
Decision tree algorithms
Optimization
Safety decision rule generation

ABSTRACT

Slip-trip-fall (STF) accidents are one of the leading causes of injuries. Therefore, STFs need to be predicted before they occur at workplaces. Although a number of studies have analysed STFs, machine learning (ML)-based approaches for both predicting STFs and analysing their contributing factors remain an unexplored area of research. Therefore, the aim of this study is to develop a novel methodology for predicting STF occurrences using decision tree (DT) classifiers, namely C5.0, classification and regression tree (CART), and random forest (RF). The parameters of the classifiers are optimized using two state-of-the-art optimization algorithms, namely particle swarm optimization (PSO) and genetic algorithm (GA), for enhanced prediction accuracy. Experimental results reveal that the PSO-RF algorithm produces the best accuracy compared with the others. Finally, the proposed method generates a set of 20 interpretable safety decision rules explaining the factors behind the occurrences of STFs.

1. Introduction

Safety is an important issue in any occupation. Due to the presence of hazardous elements at the workplace, workers are usually exposed to occupational risk.
About 2.3 million workers are killed each year due to occupational accidents, including nearly 360 thousand fatal accidents (Sánchez et al., 2011). The main cause of accidents is unsafe acts, unsafe conditions, or both. Slip-trip-falls (STFs) have been recognised as the prime cause of occupational injuries. For example, STFs account for about 20–40% of the total occupational injuries in the USA, UK, and Sweden (Courtney et al., 2001; Nenonen, 2013; Yoon and Lockhart, 2006). According to a Finnish study, nearly 30% of all accidents at work are related to STF. Further, accidents caused by STF deeply impact the economy of a country. For example, in the USA, the estimated direct annual cost of STF-related injuries is 6 billion dollars (Courtney et al., 2001). Even in Finland, this figure reaches 400 million euros. Similarly, large enterprises across the world are suffering from significant STF-related injuries.

A number of factors are found to be responsible for STF-related accidents at work. These factors include individual, environmental, task, equipment, and location factors, or their combinations (Bentley, 2009; Redfern et al., 2001). In particular, factors like footwear, underfoot conditions, and gait patterns have been identified as the major contributors to STF-related accidents (Gao et al., 2008). According to Gao et al. (2008) and Courtney et al. (2001), factors such as low friction and slipperiness, or loose grip between the underfoot surface and footwear, are considered the primary risk factors. Of these, slipperiness alone accounts for 40–50% of fall-related injuries. Other than slipperiness, several other factors influence STF-related accidents, such as human activity, fatigue, ageing, hazard perception, and so forth (Bentley, 2009; Gao et al., 2008).
However, these ergonomics-related data have certain limitations: they are (i) micro in nature, (ii) difficult to capture, and (iii) costly to collect and uncomfortable for the workers during collection. Therefore, surrogate data should be used to get the broader pattern for engineering and intervention. Industry-level data collection is necessary at this stage for the trade-off between ergonomics and engineering variables. Here, engineering variables are the variables or attributes captured by the industry for its own purposes. These data are generated at various stages of the plant-level safety management process and are usually stored in the electronic database of the respective industry. If these data are properly analysed to extract meaningful information or knowledge in the form of patterns, it becomes possible to predict the occurrence of accidents more accurately, and consequently many causal factors behind the accidents can be explored. In addition to the prediction, analysis of the factors contributing towards accidents is also

https://doi.org/10.1016/j.ssci.2019.05.009
Received 2 January 2019; Received in revised form 22 March 2019; Accepted 6 May 2019
* Corresponding author.
E-mail addresses: sobhan.sarkar@gmail.com (S. Sarkar), rahul361raj@gmail.com (R. Raj), sammangi.vinay@gmail.com (S. Vinay), jhareswar.maiti@hotmail.com (J. Maiti), dkpra@mech.iitkgp.ac.in (D.K. Pratihar).
Safety Science 118 (2019) 57–69
0925-7535/ © 2019 Elsevier Ltd. All rights reserved.