0018-9545 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TVT.2019.2924906, IEEE Transactions on Vehicular Technology

Improving Driver Identification for the Next Generation of In-vehicle Software Systems

Abdellah El Mekki, Afaf Bouhoute and Ismail Berrada

Abstract: This paper deals with driver identification and fingerprinting and its application to enhanced driver profiling and car security in connected cars. We introduce a new driver identification model based on data collected from smartphone sensors and/or the OBD-II protocol, using Convolutional Neural Networks (CNN) and Recurrent Neural Networks with Long Short-Term Memory (RNN/LSTM). Unlike existing work, we use a cross-validation technique that provides reproducible results when applied to unseen realistic data. We also study the robustness of the model to sensor data anomalies. The obtained results show that the model accuracy remains acceptable even when the rate of anomalies increases substantially. Finally, the proposed model was tested on different datasets and implemented in the Automotive Grade Linux framework as a real-time anti-theft and driver profiling system.

Index Terms: Driver Identification, Time Series, Neural Networks, CNN, RNN/LSTM, Automotive Grade Linux, Anomaly Detection.

I. INTRODUCTION

The continued evolution of computer and communication technologies over the last decades has contributed greatly to making cars smarter and more connected. Besides their mechanical components, modern cars are equipped with more than 50 computer systems that control various functions, including vehicle control, safety features and infotainment.
Fueled by this evolution of the car architecture, automotive manufacturers started integrating new technologies such as Wi-Fi, GPS navigation, and 3G/4G connectivity into cars, enabling the deployment of innovative products and cloud-based services. With this change in car manufacturing strategies, the automotive industry is moving from a product-based to a service-based economy, where all players (car makers, communication and software companies) interact to provide better services for the connected car. In this new ecosystem, the driver is connected to his car (through his personal devices) and to all driving-related services: insurance services [23], credit card payment, manufacturer services [27] (e.g., predictive vehicle maintenance), driver online accounts, traffic information, optimal routes (in terms of distance or fuel consumption), software updates, data analysis, etc.

Copyright (c) 2015 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.
A. El Mekki is with the Mohammed VI Polytechnic University, Bengrir, and the Sidi Mohamed Ben Abdellah University, Laboratory of Informatics, Modeling and Systems, Department of Computer science, Faculty of science, B.P. 1796 Fez-Atlas, 30003, Morocco.
A. Bouhoute is with Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, F-33400, Talence, France.
I. Berrada is with the Sidi Mohamed Ben Abdellah University, Laboratory of Informatics, Modeling and Systems, Department of Computer science, Faculty of science, B.P. 1796 Fez-Atlas, 30003, Morocco.

A modern vehicle can have many built-in and mobile sensors (in some cases, more than 100) and multiple options to connect digitally to drivers, roads and other vehicles. It collects and generates a tremendous amount and variety of data to provide on-board services to consumers.
The generated data include information about the vehicle as well as the driver and can be accessed in several standardized ways. One of the most popular is the OBD-II protocol, which provides data from the vehicle's ECUs (Electronic Control Units). More recently, thanks to their evolution and widespread use, low-cost smartphones have been used to collect GPS information, speed, acceleration, etc. Furthermore, smartphones can now be fully integrated into the vehicle thanks to automotive operating systems such as Android Auto [24], QNX Automotive OS [26], and Automotive Grade Linux (AGL) [25].

As technology and data science continue to evolve, vehicles will gain more autonomous features and become highly adaptable to their drivers [1]. The data gathered from connected vehicles will help researchers and manufacturers build a more comfortable driving experience by developing personalized driver profiles. However, the continuous connectivity of connected cars raises security concerns, as it makes the car vulnerable to hacking and theft [4]. To tackle this issue, research on driver identification has emerged to enhance driver profiling and car security.

Driver identification is of significant importance in the context of connected cars as well as for emerging paradigms such as shared mobility and car sharing. In a connected car, driver identification can be used for infotainment applications by providing personalized recommendations, or for car security by detecting unusual drivers, thus preventing car theft and hacking. In a shared mobility context, accurate identification of the car operator may help car sharing providers detect unauthorized access, prevent danger caused by bad driving, and provide personalized rides for their consumers.
Driver identification may also enable innovative services: insurance companies could identify and charge higher premiums to policyholders who lend their cars to other drivers, and traffic controllers could confirm the identity of drivers responsible for traffic law violations.

This paper deals with driver identification and fingerprinting using driver behavior data [13]. The main contribution of this paper consists of treating driver personal data as time series, and driver identification as a multivariate time series classification problem. Considering the uniqueness of driver behavior, our objective is to identify the driver behind a given set of unseen driving data, using a deep learning model. To this end, we propose a complete end-to-end framework for driver