Foreground Extraction Based Facial Emotion Recognition Using Deep Learning Xception Model

Alwin Poulose 1, Chinthala Sreya Reddy 2,3, Jung Hwan Kim 1, and Dong Seog Han 1,*
1 School of Electronic and Electrical Engineering, Kyungpook National University, Daegu, Republic of Korea
2 School of Computer Science and Engineering, Kyungpook National University, Daegu, Republic of Korea
3 Department of Computer Science, CHRIST University, Bangalore, India
alwinpoulosepalatty@knu.ac.kr 1, sreyareddy2000@gmail.com 2,3, jkim267@knu.ac.kr 1, dshan@knu.ac.kr 1,*

Abstract—Facial emotion recognition (FER) systems play a significant role in autonomous driving systems (ADS). In an ADS, the FER system identifies the driver's emotions and reports the driver's current mental state to support safe driving. The driver's mental state affects the safety of the vehicle, and monitoring it helps reduce the chance of road accidents. The FER system classifies the driver's emotions into categories such as happy, sad, angry, surprise, disgust, fear, and neutral. To identify these emotions, the FER system must be trained on large FER datasets, and its performance depends heavily on the dataset used for model training. Recent FER systems use publicly available datasets such as FER 2013, extended Cohn-Kanade (CK+), AffectNet, and JAFFE for model training. However, models trained with these datasets have major flaws when the system tries to extract the FER features from the datasets. To address the feature extraction problem in FER systems, in this paper we propose a foreground extraction technique to identify user emotions. The proposed foreground extraction-based FER approach accurately extracts the FER features, and the deep learning model used in the system effectively utilizes these features for model training. Model training with our FER approach yields more accurate classification results than the conventional FER approach.
To validate our proposed FER approach, we collected user emotions from 9 people and used the Xception architecture as the deep learning model. The FER experiment and result analysis show that the proposed foreground extraction-based approach reduces the classification error present in the conventional FER approach. The FER results from the proposed approach show a 3.33% improvement in model accuracy over the conventional FER approach.

Index Terms—Facial emotion recognition (FER), autonomous driving system (ADS), deep convolutional neural networks (DCNNs), foreground extraction.

I. Introduction

In recent years, facial emotion recognition (FER) systems have played a major role in autonomous driving systems (ADS) for safe driving [1]. In an ADS, the FER system infers the driver's mental condition from his/her emotions, and this information is useful for reducing vehicle collisions and road accidents. The FER system conveys the emotional state of the driver from his/her facial expressions, which indicates the driver's mental health [2]. The output of the FER system is an influential factor in the performance of an ADS, so it is necessary to monitor the driver's emotions for safe driving. Many researchers have proposed various techniques for emotion recognition, and these techniques achieve remarkable FER performance for autonomous driving applications [3][4][5][6][7]. However, most existing FER systems use publicly available datasets such as FER 2013 [8], extended Cohn-Kanade (CK+) [9], AffectNet [10], and JAFFE [11] for model training, and models trained with these datasets have feature extraction issues in real-time implementation. Raw facial images increase the computation time of the FER system, so data preprocessing is necessary before the system trains a deep learning model.
The lack of features in the existing datasets causes classification errors, and these errors directly affect the FER system's performance. To reduce the classification error and feature extraction issues that exist in FER systems, we propose a foreground extraction-based FER approach, which adds a high level of feature information to the FER dataset. Our experimental results and analysis show that the proposed foreground extraction-based FER approach reduces the classification error and predicts user emotions with high model accuracy and minimum loss.

In this paper, we propose a FER approach that predicts the current emotions of the user/driver. The proposed FER approach applies a foreground extraction technique [12] to FER datasets, and the dataset after foreground extraction conveys useful information for model training. The foreground extraction technique increases the feature information, so the deep learning model can classify user emotions with minimum error. To validate our proposed FER approach, we created a dataset with the emotions of 9 people based on the foreground extraction technique. The proposed system uses the Xception architecture [13] as the deep learning model, and we trained this model with our FER dataset. The FER results from our approach show that the Xception model is able to predict

978-1-7281-6476-2/21/$31.00 ©2021 IEEE ICUFN 2021