Optimal deep transfer learning based ethnicity recognition
on face images
Marwa Obayya a, Saud S. Alotaibi b, Sami Dhahb c,d, Rana Alabdan e, Mesfer Al Duhayyim f,
Manar Ahmed Hamza g,⁎, Mohammed Rizwanullah g, Abdelwahed Motwakel g
a Department of Biomedical Engineering, College of Engineering, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
b Department of Information Systems, College of Computing and Information System, Umm Al-Qura University, Saudi Arabia
c Department of Computer Science, College of Science & Art at Mahayil, King Khalid University, Saudi Arabia
d University of Tunis EL Manar, Higher Institute of Computer, Research Team on Intelligent Systems in Imaging and Artificial Vision (SIIVA) – Lab LIMTIC, Aryanah 2036, Tunisia
e Department of Information Technology, College of Computer and Information Science, Majmaah University, Al-Majmaah 11952, Saudi Arabia
f Department of Computer Science, College of Sciences and Humanities- Aflaj, Prince Sattam bin Abdulaziz University, Saudi Arabia
g Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam bin Abdulaziz University, AlKharj, Saudi Arabia
Article info
Article history:
Received 31 March 2022
Received in revised form 30 September 2022
Accepted 23 October 2022
Available online 29 October 2022
Abstract
In recent times, deep learning (DL) driven face image analysis has gained significant interest in application
areas such as surveillance, security, and biometrics. Facial analysis aims to compute facial soft biometrics
such as ethnicity, expression, identification, age, and gender. Among these biometrics, ethnicity recognition
remains an active research area. Recent advances in computer vision (CV) and artificial intelligence (AI)
form the basis for the effective design of ethnicity recognition models. With this motivation, this paper
introduces a novel Harris Hawks optimization with deep transfer learning based fusion model for face ethnicity
recognition (HHODTLF-FER). The proposed HHODTLF-FER model aims to determine the ethnicity shown in an
input facial image. A fusion of three pre-trained DL models, namely VGG16, Inception v3, and capsule network
(CapsNet), is employed. In addition, a bidirectional long short-term memory (BiLSTM) model is applied for
ethnicity recognition and classification. Finally, the Harris Hawks optimization (HHO) algorithm is utilized to
fine-tune the hyperparameters of the BiLSTM model, which constitutes the novelty of the work. To verify the
improved recognition performance of the HHODTLF-FER model, a wide-ranging experimental analysis is performed
using benchmark databases. The comprehensive comparative study highlights the promising performance of
the HHODTLF-FER model over the other approaches.
© 2022 Elsevier B.V. All rights reserved.
Keywords:
Ethnicity recognition
Face images
Deep learning
Fusion model
Hyperparameter tuning
Face recognition
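The fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the three feature extractors are combined, so the code below assumes feature-level fusion by concatenating L2-normalized feature vectors; this is an illustrative assumption, not the authors' confirmed method. The functions `fake_vgg16`, `fake_inception_v3`, and `fake_capsnet` are hypothetical stand-ins that return toy vectors in place of real pre-trained networks.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def l2_normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit Euclidean length (zero vectors stay zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0 for _ in v]
    return [x / norm for x in v]


def fuse_features(
    image: Sequence[float],
    extractors: Sequence[Callable[[Sequence[float]], Vector]],
) -> Vector:
    """Concatenate the normalized outputs of several feature extractors.

    ASSUMPTION: fusion by concatenation; the source text does not specify
    the fusion rule.
    """
    fused: Vector = []
    for extract in extractors:
        fused.extend(l2_normalize(extract(image)))
    return fused


# Hypothetical stand-ins for the pre-trained models named in the abstract.
# Real extractors would return high-dimensional deep features.
def fake_vgg16(image: Sequence[float]) -> Vector:
    return [sum(image), max(image)]


def fake_inception_v3(image: Sequence[float]) -> Vector:
    return [image[0], image[-1], float(len(image))]


def fake_capsnet(image: Sequence[float]) -> Vector:
    return [sum(image) / len(image)]


if __name__ == "__main__":
    toy_image = [0.2, 0.4, 0.6]  # a flattened toy "image"
    fused = fuse_features(toy_image, [fake_vgg16, fake_inception_v3, fake_capsnet])
    print(len(fused))  # total length is the sum of the three feature sizes
```

Normalizing each block before concatenation keeps one extractor with large activations from dominating the fused vector. In the pipeline outlined in the abstract, such a fused vector would then be passed to the BiLSTM classifier.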
1. Introduction
The face is one of the parts of the human body that carries most of
the semantic information about a person. The so-called facial soft
biometrics, such as ethnicity, gender, age, identity, and expressions,
have recently attracted the interest of the pattern recognition
community, thanks to the large number of possible applications in
retail and video surveillance and to the inherent difficulty of
designing efficient and reliable algorithms for challenging real-world
scenarios [1]. Currently, surveillance systems are used to secure the
public. The advancement of artificial intelligence (AI), specifically
AI for computer vision (CV), has made it much simpler to analyze the
resulting videos [2]. Various studies have recently addressed the
problem of event detection in video surveillance, which requires the
ability to identify and localize specified spatiotemporal patterns.
Another major issue in surveillance video analysis that attracts much
research attention is the person re-identification problem [3]. Person
re-identification is the task of recognizing an individual across
different images taken by multiple cameras or by a single camera [4].
In contrast, ethnicity recognition (ER), that is, the ability of a
system to determine whether a person belongs to one of a set of
ethnicity categories based on observed facial appearance cues such as
skin color, morphology, and other specific patterns, has not received
the same level of interest from the scientific community [5]. Interest
in ER is nevertheless growing, and new methodologies and datasets have
recently been proposed, either to improve the accuracy of real
applications whose performance is biased by ethnicity (face detection
and recognition, gender classification, age estimation) or to give a
decisive push to applications in forensics (ethnicity)
[6]. The shortage of ethnicity data is primarily because of innate
Image and Vision Computing 128 (2022) 104584
⁎ Corresponding author.
E-mail address: ma.hamza@psau.edu.sa (M.A. Hamza).
https://doi.org/10.1016/j.imavis.2022.104584
0262-8856/© 2022 Elsevier B.V. All rights reserved.