Neurocomputing xxx (xxxx) xxx
Adversarial attacks on Faster R-CNN object detector
Yutong Wang a,b, Kunfeng Wang c,∗, Zhanxing Zhu d, Fei-Yue Wang a
a The State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
b University of Chinese Academy of Sciences, Beijing 100049, China
c College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
d School of Mathematical Sciences, Peking University, Beijing 100871, China
Article info
Article history:
Received 29 September 2019
Revised 12 November 2019
Accepted 22 November 2019
Available online xxx
Communicated by Dr. Nianyin Zeng
Keywords:
Adversarial attack
Object detection
White-box attack
Black-box attack
Abstract
Adversarial attacks have stimulated research interest in the field of deep learning security. However, most existing adversarial attack methods have been developed for classification. In this paper, we use Projected Gradient Descent (PGD), the strongest first-order attack method on classification, to produce adversarial examples against the total loss of the Faster R-CNN object detector. Compared with the state-of-the-art Dense Adversary Generation (DAG) method, our attack is more efficient and more powerful in both white-box and black-box settings, and is applicable to a variety of neural network architectures. On Pascal VOC2007, under white-box attack, DAG reduces Faster R-CNN with a VGG16 backbone to 5.92% mAP using 41.42 iterations on average, while our method achieves 0.90% mAP using only 4 iterations. We also analyze the difference between attacks on classification and on detection, and find that in addition to misclassification, adversarial examples on detection also lead to mis-localization. Furthermore, we validate the adversarial effectiveness of both the Region Proposal Network (RPN) loss and the Fast R-CNN loss, the two components of the total loss. Our research will provide inspiration for further efforts in adversarial attacks on other vision tasks.
© 2019 Elsevier B.V. All rights reserved.
1. Introduction
Since deep neural networks [1] have recently achieved state-of-the-art performance on speech and visual recognition tasks, researchers have widely applied them to important practical problems, such as image classification [2,3], object detection [4–8], face recognition [9], natural language processing [10], biochemical analysis [11], and malware detection [12]. However, as neural networks are deployed more and more widely, the incentives for adversaries to attack them also increase.
Many prior works have demonstrated that convolutional neural networks (CNNs) are vulnerable to so-called adversarial examples [13,14]. These adversarial examples are intentionally perturbed inputs that are misclassified by CNNs but not by human observers. This problem reveals a weakness of CNNs and has drawn the attention of many researchers. CNNs are vulnerable not only to adversarial examples that are fed in directly, but also to adversarial examples captured by cameras [15]. For example, in the physical world, a person wearing well-crafted printed eyeglass frames can be recognized as another person [16]. This method successfully attacks Face Recognition Systems (FRS) and Face Detection Systems, raising concerns about the security of applications based on FRS. Likewise, perturbations have been crafted to mislead traffic sign recognition systems into recognizing a stop sign as a speed limit sign [17], which can be very dangerous when such unsafe recognition systems are deployed in autonomous driving.
∗ Corresponding author.
E-mail addresses: wangkf@mail.buct.edu.cn, kunfeng.wang@ia.ac.cn (K. Wang).
Although much research has been devoted to classification, only a little has addressed object detection, which is a more complex task [18,19]. The practical application scenarios of CNNs usually contain a variety of objects, so it is more meaningful and more practical to study adversarial attacks on object detection. Compared with classification, object detection typically generates region proposals before classification, and the procedure of classifying each proposal is similar to ordinary classification. In this way, an adversarial attack on object detection can be translated into the problem of generating adversarial examples over a set of proposals [20]. So how about applying attack methods from classification to object detection?
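To make the idea concrete, the core of a classification-style gradient attack such as PGD is a signed-gradient ascent step on the loss, followed by projection back into an L∞ ball around the clean input; applied to a detector, the same loop simply differentiates the detector's total loss instead of a classification loss. The following is a minimal sketch, not the paper's exact procedure: the function names and hyperparameters are illustrative, and a toy analytic loss stands in for the detector loss.

```python
import numpy as np

def pgd_attack(x, grad_fn, eps=8 / 255, alpha=2 / 255, steps=10):
    """Untargeted PGD: ascend the loss by signed gradient steps,
    projecting back into the L-infinity ball of radius eps around x."""
    x_orig = x.copy()
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_fn(x_adv)                    # gradient of the (total) loss w.r.t. the input
        x_adv = x_adv + alpha * np.sign(g)    # signed-gradient ascent step
        x_adv = np.clip(x_adv, x_orig - eps, x_orig + eps)  # project into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)      # stay in the valid image range
    return x_adv

# Toy stand-in for a detector's total loss: L(x) = ||x - t||^2,
# whose gradient 2*(x - t) we supply analytically.
t = np.full(4, 0.9)
grad = lambda x: 2 * (x - t)
x = np.full(4, 0.5)
adv = pgd_attack(x, grad)
```

In an actual detection attack, `grad_fn` would backpropagate the sum of the RPN and Fast R-CNN losses to the input image; the projection step is what bounds the perturbation so it stays imperceptible.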
Considering the similarity between classification and detection, we use Projected Gradient Descent (PGD) [21], the strongest first-order attack method on classification, to generate adversarial examples on Faster R-CNN [18] with VGG16 [22] and ResNet101 [23] backbones. The generated adversarial examples are illustrated in Fig. 1. Our method achieves a higher success rate than the state-of-the-art Dense Adversary Generation (DAG) method [20], and
https://doi.org/10.1016/j.neucom.2019.11.051