Training dataset construction for anomaly detection in face
anti-spoofing
L. Abduh and I. Ivrissimtzis
Durham University, Department of Computer Science, UK
Abstract
Anomaly detection, which treats face anti-spoofing as a one-class classification problem, is emerging
as an increasingly popular alternative to the traditional approach of training binary classifiers on specialised anti-spoofing
databases that contain both client and imposter samples. In this paper, we discuss the training protocols in the existing work
on anomaly detection for face anti-spoofing, and note that they use images exclusively from specialised anti-spoofing databases,
even though only ordinary images of real faces are needed.
In a proof-of-concept experiment, we demonstrate the potential benefits of adding to the anomaly detection training sets images
from general face recognition databases, rather than specialised face anti-spoofing databases, or in-the-wild images.
We train a convolutional autoencoder on real faces and compare the reconstruction error against a threshold to classify a face
image as either client or imposter. Our results show that including in-the-wild images in the training set increases the
discriminating power of the classifier on an unseen database, as evidenced by an increase in the Area Under the
Curve.
CCS Concepts
• Computing methodologies → Computer vision tasks; Image manipulation;
1. Introduction
Face liveness tests authenticate users of face recognition systems
by processing input images and deciding whether they come from
a human face or, for example, from printed photos held in front of
the system’s camera by an imposter. The main challenge in
developing a robust face anti-spoofing system is the large number of
different types of presentation attack the system must learn to
recognize. For example, an imposter could present to the face
recognition system a printed photo, a screen displaying a still
image, or a screen replaying a video. A multitude of other factors,
such as the quality of the printed photo, the resolution and type
of the displaying screen, the illumination conditions of the scene,
and the characteristics of the system’s camera, may also have a
significant effect on the performance of any anti-spoofing algorithm.
Moreover, a robust anti-spoofing algorithm should be able to cope
with previously unseen attack methods, which were not anticipated
prior to its deployment.
Traditionally, face anti-spoofing is approached as a binary
classification problem and classifiers are trained on specialised datasets
containing both client and imposter images and videos. The main
limitation of this approach stems from the high cost of
creating such databases. That is, only a limited number of attacks is
simulated, on a limited number of subjects, while the variability
of important environmental factors such as illumination conditions
and background is also limited. As a result, the classifiers do not
always generalize well to previously unseen attacks.
In this context, anomaly detection, using classifiers trained on a
one-class dataset of client images only, is becoming an
increasingly popular approach to face anti-spoofing [AKC17][AK18].
The present work is motivated by the observation that a classifier
trained on client images only can also be trained on in-the-wild face
images, that is, face images harvested online, as well as on face images
from databases that do not specialise in face anti-spoofing.
After giving a brief overview of the general literature on face
anti-spoofing, in Section 2.2 we review the relevant literature on
the use of anomaly detection for face anti-spoofing and establish
our main observation: in the existing literature, the training
data are drawn from specialised face anti-spoofing databases, even
though the training images are just ordinary images of real faces.
In Sections 3 and 4, we describe a proof-of-concept experiment
on the feasibility of an alternative approach to the creation of
one-class training sets. Specifically, we augment an initial training set
of client images from specialised face anti-spoofing databases, first
with images from non-specialised databases, namely SCFace [GDG11]
and CASIA-Web Face [YLLL14], and then with in-the-wild images,
which were semi-automatically harvested from online sources.
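The decision rule used in the experiment, comparing an autoencoder's reconstruction error against a threshold and summarising performance by the Area Under the Curve, can be illustrated with a minimal sketch. The sketch below is not the implementation evaluated in this paper: it assumes mean squared error as the reconstruction error, treats the autoencoder as already trained, and uses hypothetical function names and made-up error values purely for illustration. The AUC is computed through its standard rank interpretation, that is, the probability that a randomly chosen imposter image has a higher reconstruction error than a randomly chosen client image.

```python
from typing import Sequence


def reconstruction_error(image: Sequence[float],
                         reconstruction: Sequence[float]) -> float:
    """Mean squared error between a flattened image and its reconstruction.

    MSE is an assumption made for this sketch; other error measures
    could be substituted without changing the rest of the code.
    """
    if len(image) != len(reconstruction):
        raise ValueError("image and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(image, reconstruction)) / len(image)


def classify(error: float, threshold: float) -> str:
    """Label a face as 'imposter' when its error exceeds the threshold.

    An autoencoder trained on real faces only is expected to
    reconstruct spoofed faces less accurately.
    """
    return "imposter" if error > threshold else "client"


def auc(client_errors: Sequence[float],
        imposter_errors: Sequence[float]) -> float:
    """Area under the ROC curve, via pairwise comparison of errors.

    Counts the fraction of (imposter, client) pairs in which the
    imposter has the larger error; ties count as one half.
    """
    if not client_errors or not imposter_errors:
        raise ValueError("both classes need at least one sample")
    wins = 0.0
    for e_imp in imposter_errors:
        for e_cli in client_errors:
            if e_imp > e_cli:
                wins += 1.0
            elif e_imp == e_cli:
                wins += 0.5
    return wins / (len(client_errors) * len(imposter_errors))


if __name__ == "__main__":
    # Hypothetical reconstruction errors, not results from the paper.
    client_errors = [0.01, 0.02, 0.03]
    imposter_errors = [0.025, 0.05, 0.08]
    print("AUC:", auc(client_errors, imposter_errors))
    print("label:", classify(0.05, threshold=0.03))
```

Because the AUC depends only on the ordering of the errors, it measures the discriminating power of the classifier independently of any particular threshold, which is why it is a convenient way to compare training sets.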
© 2021 The Author(s)
Eurographics Proceedings © 2021 The Eurographics Association.
DOI: 10.2312/cgvc.20211312
EG UK Computer Graphics & Visual Computing (2021)
K. Xu and M. Turner (Editors)