ARTICLE IN PRESS — Neurocomputing xxx (xxxx) xxx

Adversarial attacks on Faster R-CNN object detector

Yutong Wang a,b, Kunfeng Wang c,∗, Zhanxing Zhu d, Fei-Yue Wang a

a The State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
b University of Chinese Academy of Sciences, Beijing 100049, China
c College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
d School of Mathematical Sciences, Peking University, Beijing 100871, China

Article history: Received 29 September 2019; Revised 12 November 2019; Accepted 22 November 2019. Communicated by Dr. Nianyin Zeng.

Keywords: Adversarial attack; Object detection; White-box attack; Black-box attack

Abstract

Adversarial attacks have stimulated research interest in the field of deep learning security. However, most existing adversarial attack methods are developed for classification. In this paper, we use Projected Gradient Descent (PGD), the strongest first-order attack method on classification, to produce adversarial examples against the total loss of the Faster R-CNN object detector. Compared with the state-of-the-art Dense Adversary Generation (DAG) method, our attack is more efficient and more powerful in both white-box and black-box attack settings, and is applicable to a variety of neural network architectures. On Pascal VOC2007, under white-box attack, DAG reduces Faster R-CNN with a VGG16 backbone to 5.92% mAP using 41.42 iterations on average, while our method achieves 0.90% mAP using only 4 iterations. We also analyze the difference between attacks on classification and on detection, and find that in addition to misclassification, adversarial examples on detection also lead to mis-localization.
Besides, we validate the adversarial effectiveness of both the Region Proposal Network (RPN) loss and the Fast R-CNN loss, the two components of the total loss. Our research will provide inspiration for further efforts on adversarial attacks against other vision tasks.

© 2019 Elsevier B.V. All rights reserved.

∗ Corresponding author. E-mail addresses: wangkf@mail.buct.edu.cn, kunfeng.wang@ia.ac.cn (K. Wang).

1. Introduction

Since deep neural networks [1] have recently achieved state-of-the-art performance on speech and visual recognition tasks, researchers increasingly apply them to important practical problems such as image classification [2,3], object detection [4–8], face recognition [9], natural language processing [10], biochemical analysis [11], and malware detection [12]. However, with the increasingly wide use of neural networks, the incentives for adversaries to attack them also increase.

Many prior works have demonstrated that convolutional neural networks (CNNs) are vulnerable to so-called adversarial examples [13,14]. These adversarial examples are intentionally perturbed inputs that are misclassified by CNNs but not by human observers; this weakness of CNNs has drawn the attention of many researchers. CNNs are vulnerable not only to adversarial examples that are fed in directly, but also to adversarial examples captured by cameras [15]. For example, in the physical world, a person wearing well-crafted printed eyeglass frames can be recognized as another person [16]. This method successfully attacks both a Face Recognition System (FRS) and a Face Detection System, raising concerns about the security of applications built on FRS. Likewise, perturbations have been crafted to mislead traffic sign recognition systems into recognizing a stop sign as a speed limit sign [17], which can be very dangerous when such unsafe systems are deployed in autonomous driving.
Although much research has been devoted to classification, only a little has addressed object detection, which is a more complex task [18,19]. Practical application scenarios of CNNs usually involve a variety of objects, so it is more meaningful and more practical to study adversarial attacks on object detection. Compared with classification, object detection typically generates region proposals before classification, and the procedure of classifying each proposal is similar to ordinary classification. In this way, an adversarial attack on object detection can be translated into the problem of generating adversarial examples over a set of proposals [20]. This raises a natural question: can attack methods from classification be applied directly to object detection?

Considering the similarity between classification and detection, we use Projected Gradient Descent (PGD) [21], the strongest first-order attack method on classification, to generate adversarial examples on Faster R-CNN [18] with VGG16 [22] and ResNet101 [23] backbone structures. The generated adversarial examples are illustrated in Fig. 1. Our method achieves a higher success rate than the state-of-the-art Dense Adversary Generation (DAG) method [20], and

https://doi.org/10.1016/j.neucom.2019.11.051
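The PGD procedure applied here to the detector's total loss can be sketched as follows. This is a minimal illustration, not the paper's implementation: the signed-gradient ascent step and the L∞ projection are the standard PGD of Madry et al. [21], while `grad_fn` is a hypothetical placeholder standing in for backpropagation of Faster R-CNN's total loss (RPN loss plus Fast R-CNN loss) to the input image; the step size, bound, and iteration count are illustrative defaults.

```python
import numpy as np

def pgd_attack(grad_fn, x, epsilon=8/255, alpha=2/255, num_iters=4):
    """L-infinity PGD: repeat a signed gradient-ascent step, then project
    the perturbed image back into the epsilon-ball around the clean input.

    grad_fn(x_adv) returns dL/dx, the gradient of the detector's total
    loss w.r.t. the input image -- here a hypothetical placeholder for
    backpropagation through Faster R-CNN's RPN + Fast R-CNN losses.
    """
    x_adv = x.copy()
    for _ in range(num_iters):
        g = grad_fn(x_adv)
        x_adv = x_adv + alpha * np.sign(g)                # ascend the loss
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)  # L-inf projection
        x_adv = np.clip(x_adv, 0.0, 1.0)                  # stay a valid image
    return x_adv

# Toy check with a quadratic "loss" L(z) = sum(z^2), whose gradient is 2z:
# the attack should raise the loss while staying inside the epsilon-ball.
clean = np.full((3, 32, 32), 0.5)
adv = pgd_attack(lambda z: 2.0 * z, clean)
```

In an actual attack, `grad_fn` would be realized by a framework's automatic differentiation through the detector, and the untargeted attack simply maximizes the same total loss that training minimizes.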