Physically Realizable Adversarial Attacks and Defenses – A Review
Dhruv Behl¹, Dr. B Sathish Babu¹
¹Department of Computer Science and Engineering, R V College of Engineering, Bengaluru, India
Abstract - Deep neural networks have become the
approach of choice in a multitude of domains, especially
computer-vision-related tasks such as image classification,
localization and segmentation. However, numerous
demonstrations have shown that deep neural networks may
be easily deceived by precisely perturbing pixels in an image,
which is commonly referred to as an adversarial attack. As a
result, a considerable amount of literature has evolved on
defending deep neural networks against adversarial
examples, with approaches for learning more robust neural
network models or detecting malicious inputs being
proposed. Notably, while considerable attention has been devoted to defending against adversarial perturbations in the digital space, there are no effective methods specifically designed to defend against attacks that are physically realizable in the real world. We study the problem of defending deep neural network approaches for image classification from physically realizable attacks. First, we survey the physically realizable attacks proposed in recent years and tabulate their attack performance on different datasets. Then, we
discuss the existing defenses against physical attacks, their
robustness, and their shortcomings. Finally, we discuss the
challenges faced by most of the current defenses and present
future research perspectives needed to achieve true
adversarial robustness.
Key Words: Adversarial Robustness, Convolutional
Neural Networks, Deep Learning, Image Classification,
Safety, Security.
1. INTRODUCTION
Computer Vision is the study of how computers can extract high-level information from digital images or videos. Classification tasks make up a large part of the field of Computer Vision; the task of assigning a label to an image is known as image classification.
As a core challenge in computer vision and machine learning, approaches to classification have been widely researched in both academia and industry, and significant progress has been made. Convolutional neural networks (CNNs) are the most popular image classification models, achieving better-than-human performance on a variety of benchmark datasets, although their real-world performance on data from new institutions and curated collections remains an open question. Fig 1 shows a demonstration of a CNN being used for the image classification task.
Fig 1. CNN being used for Image Classification.
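For concreteness, the snippet below sketches a small CNN image classifier in PyTorch. The architecture, input size (32x32 RGB) and number of classes (10) are illustrative assumptions, not a model taken from any specific work surveyed here.

```python
import torch
import torch.nn as nn

# Minimal illustrative CNN classifier (assumed: 32x32 RGB inputs, 10 classes).
class SimpleCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, start_dim=1)  # flatten feature maps per image
        return self.classifier(x)

model = SimpleCNN()
logits = model(torch.randn(1, 3, 32, 32))  # one dummy image
print(logits.argmax(dim=1))                # predicted class index
```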
The state-of-the-art effectiveness of deep neural networks has made them the technique of choice in a variety of fields,
including computer vision, natural language processing
and speech recognition. However, there have been a
myriad of demonstrations showing that deep neural
networks can be easily fooled by carefully perturbing
pixels in an image through what have become known as
adversarial example attacks. In response, a large literature
has emerged on defending deep neural networks against
adversarial examples, typically proposing either techniques for learning more robust neural network models or methods for detecting adversarial inputs.
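As an illustration of the first family of defenses, the sketch below shows one epoch of adversarial training, in which each minibatch is replaced by adversarial examples before the usual gradient step. The names `model`, `loader`, `optimizer` and `attack` are placeholders; `attack` can be any on-the-fly attack, such as the FGSM sketch given later.

```python
import torch.nn.functional as F

# Illustrative sketch of one adversarial training epoch. `attack` is any
# function that maps (model, images, labels) to adversarial images.
def adversarial_training_epoch(model, loader, optimizer, attack):
    model.train()
    for images, labels in loader:
        adv_images = attack(model, images, labels)  # craft attacks on the fly
        optimizer.zero_grad()                       # discard grads left by the attack
        loss = F.cross_entropy(model(adv_images), labels)
        loss.backward()
        optimizer.step()
```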
Adversarial examples are inputs to machine learning
models that an attacker has intentionally designed to
cause the (trained) model to make a mistake [1].
Adversarial images appear visually and semantically the
same to us, but the model ends up predicting the wrong
class with very high confidence, which is worrying. Fig 2 gives an example of an adversarial image created to fool the classifier [2].
Fig 2. Adversarial image being used to fool the classifier.
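One classic way to craft such an image is the fast gradient sign method (FGSM), which perturbs every pixel by a small step epsilon in the direction of the sign of the loss gradient. The sketch below is an illustrative PyTorch implementation and assumes a differentiable classifier `model` and pixel values in [0, 1].

```python
import torch
import torch.nn.functional as F

# Illustrative FGSM sketch: one gradient-sign step of size epsilon.
def fgsm_attack(model, image, label, epsilon=8 / 255):
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()                                # gradient w.r.t. the pixels
    adv_image = image + epsilon * image.grad.sign()
    return adv_image.clamp(0.0, 1.0).detach()      # stay in the valid pixel range
```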
The size of the perturbation is at the core of an adversarial attack; keeping the perturbation small is the fundamental premise of such attacks. When designing an adversarial example, the attacker wants the perturbed input to be as close as possible to the original one; in the case of images, close enough that a human cannot distinguish one image from the other.
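This closeness requirement is usually formalized as an L-p norm constraint on the perturbation. The sketch below, for example, projects an arbitrary perturbation back into an L-infinity ball of radius epsilon around the original image, so that no pixel changes by more than epsilon; the value of epsilon here is an assumption for illustration.

```python
import torch

# Illustrative L-infinity projection: clip the perturbation elementwise so
# that no pixel of `perturbed` differs from `original` by more than epsilon.
def project_linf(original, perturbed, epsilon=8 / 255):
    delta = torch.clamp(perturbed - original, -epsilon, epsilon)
    return torch.clamp(original + delta, 0.0, 1.0)  # also keep pixels in [0, 1]
```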
• Perturbation Scope: The attacker can generate perturbations that are input-specific, which we call individual, or it can generate a single perturbation which