Pattern Recognition 109 (2021) 107579
Iterative local re-ranking with attribute guided synthesis for face sketch recognition
Decheng Liu a, Xinbo Gao a,d,∗, Nannan Wang b,∗, Chunlei Peng c, Jie Li a
a State Key Laboratory of Integrated Services Networks, School of Electronic Engineering, Xidian University, Xi’an 710071, Shaanxi, P. R. China
b State Key Laboratory of Integrated Services Networks, School of Telecommunications Engineering, Xidian University, Xi’an 710071, Shaanxi, P. R. China
c State Key Laboratory of Integrated Services Networks, School of Cyber Engineering, Xidian University, Xi’an 710071, Shaanxi, P. R. China
d Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Article info
Article history:
Received 2 August 2019
Revised 25 March 2020
Accepted 4 August 2020
Available online 5 August 2020
Keywords:
Face sketch recognition
Facial attribute
Re-ranking
Abstract
Because of the large texture and spatial structure discrepancies between face sketches and photos,
face sketch recognition is a challenging problem in the face recognition community. For example, in
law enforcement and security, the face sketch generation process can introduce inevitable biases,
which result in poor face sketch recognition performance. To mimic the modality gap introduced by
these biases during face sketch creation, a novel iterative local re-ranking method with attribute
guided synthesis is proposed for face sketch recognition; it requires no extra manual annotation or
human interaction. Face attribute clues are used to generate images with varying local
characteristics from probe sketches, which helps eliminate the unavoidable biases. Considering the
special properties of face sketches, the iterative local re-ranking algorithm is designed to encode
contextual information together with locally invariant discriminative information for matching
sketches with photos. Experimental results on multiple face sketch databases demonstrate that the
proposed method achieves superior performance compared with state-of-the-art methods.
© 2020 Elsevier Ltd. All rights reserved.
1. Introduction
Face recognition is a challenging and important application in
computer vision. Great progress has been achieved recently;
however, many challenging scenarios remain in the real world.
In law enforcement especially, there are many cases
where a mug shot of the suspect is not available or only
poor-quality images are captured by video surveillance. Lacking
photographs of suspects, law enforcement agencies have started
to generate face sketches from descriptions provided by
eyewitnesses or from blurry surveillance videos. As technology has
improved, more forensic artists have begun to use generation
software to produce composite sketches as a replacement for
hand-drawn sketches. However, because of the complex and
specialized generation process, face sketches always contain shape
exaggerations and distortions. More importantly,
in law enforcement, face sketches are used to determine the
identity of criminals when only eyewitness descriptions are
available. Studies in forensic psychology [1] show that face
sketch recognition is affected by eyewitnesses’ memory loss and
imperfect communication. In summary, the sketch-photo
modality difference, the inaccurate memory of eyewitnesses, and
the biased communication of that memory all introduce unavoidable
biases into the generated face sketches. Thus, face sketch
recognition remains a difficult task in real-world
scenarios.
∗ Corresponding authors.
E-mail addresses: dcliu.xidian@gmail.com (D. Liu), gaoxb@cqupt.edu.cn
(X. Gao), nnwang@xidian.edu.cn (N. Wang), clpeng@xidian.edu.cn (C. Peng),
leejie@mail.xidian.edu.cn (J. Li).
Existing face sketch recognition methods mostly focus on three
aspects: 1) extracting modality-invariant features [2–4] that contain
identity-discriminative information; 2) projecting face images of
different modalities into a latent common space [5–8], where face
sketches and gallery photos can be matched directly; 3) transforming
images from one modality to another [9–12], which
places the images in a homogeneous setting so that
traditional homogeneous face recognition methods can be
applied directly. However, face sketches in real-world scenes differ
from gallery photos because of unavoidable perceptual bias,
descriptive bias, and generating bias [13]. These inevitable biases,
introduced during the generation procedure, can enlarge the gap
between modalities. Merely transforming sketches and photos into a
homogeneous setting is not enough to eliminate these biases when
matching face sketches. To mimic the modality gap mentioned be-
https://doi.org/10.1016/j.patcog.2020.107579
0031-3203/© 2020 Elsevier Ltd. All rights reserved.