Foreground Extraction Based Facial Emotion
Recognition Using Deep Learning Xception Model
Alwin Poulose¹, Chinthala Sreya Reddy²,³, Jung Hwan Kim¹ and Dong Seog Han¹,*
¹ School of Electronic and Electrical Engineering, Kyungpook National University, Daegu, Republic of Korea
² School of Computer Science and Engineering, Kyungpook National University, Daegu, Republic of Korea
³ Department of Computer Science, CHRIST University, Bangalore, India
alwinpoulosepalatty@knu.ac.kr¹, sreyareddy2000@gmail.com²,³, jkim267@knu.ac.kr¹, dshan@knu.ac.kr¹,*
Abstract —The facial emotion recognition (FER) system plays a significant role in the autonomous driving system (ADS). In ADS, the FER system identifies the driver’s emotions and reports the driver’s current mental status for safe driving. The driver’s mental status affects the safety of the vehicle and the likelihood of road accidents. In FER, the system identifies driver emotions such as happy, sad, angry, surprise, disgust, fear, and neutral. To identify these emotions, the FER system must be trained with large FER datasets, and the system’s performance depends heavily on the type of FER dataset used in model training. Recent FER systems use publicly available datasets such as FER 2013, extended Cohn-Kanade (CK+), AffectNet, and JAFFE for model training. However, models trained with these datasets have major flaws when the system tries to extract FER features from the datasets. To address the feature extraction problem in the FER system, we propose a foreground extraction technique to identify user emotions. The proposed foreground extraction-based FER approach accurately extracts the FER features, and the deep learning model used in the system effectively utilizes these features for model training. Model training with our FER approach yields more accurate classification results than the conventional FER approach. To validate the proposed FER approach, we collected user emotions from 9 people and used the Xception architecture as the deep learning model. The FER experiment and result analysis show that the proposed foreground extraction-based approach reduces the classification error that exists in the conventional FER approach. The FER results from the proposed approach show a 3.33% improvement in model accuracy over the conventional FER approach.
Index Terms—Facial emotion recognition (FER), autonomous driving system (ADS), deep convolutional neural networks (DCNNs), foreground extraction.
I. Introduction
In recent years, the facial emotion recognition (FER) system has played a major role in autonomous driving systems (ADS) for safe driving [1]. In ADS, the FER system reports the driver’s mental condition based on his/her emotions, and this information helps reduce vehicle collisions and road accidents. The FER system infers the driver’s emotional state from his/her facial expressions and from it determines the driver’s mental health [2]. The result from the FER system is a factor that influences the performance of ADS. In ADS, it is necessary to monitor the driver’s emotions for safe driving. Many researchers have proposed techniques for emotion recognition, and these techniques achieve remarkable FER performance for autonomous driving applications [3][4][5][6][7]. However, in
most FER applications, existing FER systems use publicly available datasets such as FER 2013 [8], extended Cohn-Kanade (CK+) [9], AffectNet [10], and JAFFE [11] for model training, and models trained with these datasets have feature extraction issues in real-time implementation. Raw facial images increase the computational time of the FER system, so data preprocessing is necessary before the system trains a deep learning model. The lack of features in the existing datasets creates classification error, and this error directly affects the FER system’s performance. To reduce the classification error and feature extraction issues in FER systems, we propose a foreground extraction-based FER approach, which adds a high level of feature information to the FER dataset. Our experimental results and analysis show that the proposed foreground extraction-based FER approach reduces the classification error and predicts user emotions with high model accuracy and low loss.
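As a rough illustration of the preprocessing idea, the following minimal Python sketch suppresses background pixels with a binary mask before an image would be passed to a classifier. It is an assumption-laden stand-in, not the foreground extraction technique used in this paper: the function names `threshold_mask` and `extract_foreground`, the fixed intensity threshold, and the nested-list image format are all hypothetical and chosen only for illustration. In practice the mask would more plausibly come from a segmentation method than from a global threshold.

```python
from typing import List

Image = List[List[int]]   # grayscale pixels in the range 0-255 (assumed format)
Mask = List[List[bool]]   # True marks a foreground pixel


def threshold_mask(image: Image, threshold: int) -> Mask:
    """Build a toy foreground mask: pixels brighter than `threshold` are kept.

    This global threshold is a hypothetical placeholder for a real
    segmentation step.
    """
    return [[pixel > threshold for pixel in row] for row in image]


def extract_foreground(image: Image, mask: Mask, fill: int = 0) -> Image:
    """Keep pixels where the mask is True and replace all others with `fill`."""
    same_rows = len(image) == len(mask)
    if not same_rows or any(len(r) != len(m) for r, m in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    return [
        [pixel if keep else fill for pixel, keep in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


# Tiny 2x2 example: bright pixels are treated as the face region.
image = [[10, 200], [220, 5]]
mask = threshold_mask(image, 100)
foreground = extract_foreground(image, mask)
```

Separating mask construction from mask application keeps the sketch independent of how the mask is produced, so a stronger segmentation step could replace `threshold_mask` without changing `extract_foreground`.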
In this paper, we propose an FER approach that predicts the current emotions of the user/driver. The proposed FER approach introduces a foreground extraction technique [12] for FER datasets, and the dataset after foreground extraction conveys useful information for model training. The foreground extraction technique increases the feature information so that the deep learning model can classify user emotions with minimum error. To validate the proposed FER approach, we created a dataset with the emotions of 9 people based on the foreground extraction technique. The proposed system uses the Xception architecture [13] as the deep learning model, which we trained with our FER dataset. The FER results from our approach show that the Xception model is able to predict
356 978-1-7281-6476-2/21/$31.00 ©2021 IEEE ICUFN 2021