Road Safety Performance Function Analysis With Visual Feature Importance of Deep Neural Nets

Guangyuan Pan, Liping Fu, Qili Chen, Ming Yu, and Matthew Muresan

Abstract—Road safety performance function (SPF) analysis using data-driven and nonparametric methods, especially recently developed deep learning approaches, has achieved increasingly good results. However, because the learning mechanisms of deep learning are hidden in a “black box”, extracting traffic features and analyzing their importance remain unsolved and difficult problems. This paper addresses this problem using a deciphered version of deep neural networks (DNN), one of the most popular deep learning models. The approach builds on visualization, feature importance, and sensitivity analysis, and can evaluate the contributions of input variables to the model’s “black box” feature learning process and output decision. First, a visual feature importance (ViFI) method that describes the importance of input features is proposed, combining diagrams and numerical analysis. Second, by observing the change of weights using ViFI during the unsupervised training and fine-tuning of the DNN, the final contributions of the input features are calculated according to the importance equations that we propose for both steps. Subsequently, a case study based on a road SPF analysis is demonstrated, using data collected from a major Canadian highway, Highway 401. The proposed method allows effective deciphering of the model’s inner workings, so that the significant features can be identified and the bad features eliminated. Finally, the revised dataset is used in crash modeling and vehicle collision prediction, and the testing results verify that the deciphered and revised model achieves state-of-the-art performance.

Index Terms—Deep learning, deep neural network (DNN), feature importance, road safety performance function.

I.
Introduction

EVALUATING the safety effects of countermeasures relies greatly on collision prediction models, or safety performance functions (SPF), which are an important topic in road safety studies. Safety performance functions are commonly developed separately for different types of highways or entities, and locally, using data collected from the study area representing the specific highway types to be modelled. Traditionally, they are well reflected in the Highway Safety Manual (HSM), in which several example SPFs for various types of highways and intersections from different jurisdictions are documented [1], [2]. One of the most commonly used methods is parametric modeling (e.g., the negative binomial (NB) model), which requires a series of trial-and-error steps before arriving at the final model structure with a set of significant variables [3]–[5]. Although this type of model is easy to understand and apply, its predictions have low accuracy due to the random nature of collision occurrences and the strong distributional assumption. Another technique that has been studied is non-parametric modeling (e.g., kernel regression (KR), support vector machines (SVM), and artificial neural networks (ANN)), which has achieved satisfactory prediction accuracy [6]–[9]. However, the parameters of road safety performance functions built with this method cannot be quantified, and the models are hence difficult to generalize. Recently developed artificial intelligence (AI) technology offers potential new solutions to this problem.

Artificial intelligence has already revolutionized many industries, bringing in new ideas and exciting technologies. It has also brought changes to nearly every scientific field, and many more advancements remain to come. Among the most notable techniques developed, deep learning (also called deep neural networks) is often considered one of the most remarkable [10]–[12].
Since its proposal, it has been successfully applied to solve complex problems in a variety of fields, including but not limited to pattern recognition, game theory, computer vision, medical treatment, transportation logistics, and finance [13]–[18]. In our previous research, we applied the deep belief network, one of the most popular deep learning models, to establish SPFs, and the trained model outperformed the traditional methods [19], [20]. However, despite the seemingly endless benefits deep learning brings, it possesses a certain opacity that often causes doubt and resistance from policy makers and scientists. Some findings highlight the fact that although deep learning models are trained to solve tasks based on human knowledge, the models see objects differently than humans do. As a result, AI is not fully trusted by scientists or in industry applications [21], [22]. One of the biggest reasons behind these findings is that the black-box training process cannot be analyzed. To address these limitations, researchers have begun to study defending, strengthening

Manuscript received November 4, 2019; revised January 11, 2020; accepted February 21, 2020. This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), Ontario Research Fund – Research Excellence (ORF-RE), the Ministry of Transportation Ontario (MTO) through its Highway Infrastructure Innovation Funding Program (HIIFP), Beijing Postdoctoral Science Foundation (ZZ-2019-65), Beijing Chaoyang District Postdoctoral Science Foundation (2019ZZ-45), and Beijing Municipal Education Commission (KM201811232016). Recommended by Associate Editor Lingxi Li. (Corresponding author: Liping Fu.)

Citation: G. Y. Pan, L. P. Fu, Q. L. Chen, M. Yu, and M. Muresan, “Road safety performance function analysis with visual feature importance of deep neural nets,” IEEE/CAA J. Autom. Sinica, vol. 7, no. 3, pp. 735–744, May 2020.
G. Y. Pan, L. P. Fu, and M. Muresan are with the Department of Civil and Environmental Engineering, University of Waterloo, ON N2L5G1, Canada (e-mail: garrypan0512@gmail.com; lfu@uwaterloo.ca; mimuresa@uwaterloo.ca).

Q. L. Chen is with the Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China (e-mail: qilichen@hotmail.com).

M. Yu is with the Electronic Engineering, the Chinese University of Hong Kong, Hong Kong 999077, China (e-mail: ming.yu@ee.cuhk.edu.hk).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JAS.2020.1003108

IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 7, NO. 3, MAY 2020 735