LogicGAN-based Data Augmentation Approach to Improve Adversarial Attack DNN Classifiers

Christophe Feltus
Luxembourg Institute of Science and Technology (LIST)
Maison De L'Innovation, Avenue des Hauts-Fourneaux 5, L-4362 Esch/Alzette, Luxembourg
christophe.feltus@list.lu

Abstract—This paper presents an innovative algorithmic approach to improve adversarial attack classifiers, based on data augmented by minor modifications generated by a logicGAN. The paper addresses a particular type of mitigation against adversarial attacks, which consists of training the "attacked" classifier with initial and adversarial data already known by the defender. Accordingly, we propose an algorithm that improves the training of the classifier (1) by generating complementary adversarial data which, instead of coming from the known adversarial attack, comes directly from minor modifications of the already known adversarial data, and (2) by generating these minor modifications using a specific kind of generative adversarial network named logicGAN. By using an xAI system, this GAN derivative has the particularity of yielding more substantial corrective feedback from the discriminator to the generator and, thereby, making the mitigation of adversarial attacks faster.

Index Terms—Adversarial attack, LogicGAN, Data augmentation, Generative adversarial network, Security, Attack classifier, Adversarial sample, DNN.

I. INTRODUCTION

The contribution of artificial intelligence (AI) and machine learning (ML) to the security of the information system (IS) constitutes an important area of concern for companies. A security infrastructure can no longer be deployed and updated without using artificial intelligence or machine-learning models to continuously update the security tools combating new threats that appear. Ten years ago, many existing security applications were based on multi-layer perceptron (MLP) networks.
At that time, these networks were not suitable for processing the raw situation data generated in real time by the environment. Nowadays, MLP topologies are more sophisticated [1] and have been developed to support the analysis of visual imagery (convolutional neural networks), the recurrent analysis of time series (recurrent neural networks), learning in an unknown environment (reinforcement learning [2], [3]), or implicit generative models (generative adversarial networks, GANs).

On the other side of the coin, the development of AI and ML also largely benefits attackers, who redouble their ingenuity to take advantage of the strong potential of AI to implement new attacks. This is especially the case for adversarial attacks, which consist of designing the input of a DNN (deep neural network) classifier in a specific way so that it outputs a wrong result [4]. This specific input, able to deceive the classifier, is known as an adversarial example. According to Frosst¹, an adversarial example is an image intentionally crafted to mislead a network after it has been trained. Figure 1 illustrates a classical case of an adversarial example.

In [5], a mitigation of adversarial attacks is proposed that consists of training a Defense-GAN to model the distribution of unperturbed data. At inference time, the Defense-GAN finds an output close to a given piece of data which does not contain the adversarial changes. This output is then used to improve the training of the classifier to be protected. The problem with GANs (including the Defense-GAN) is that they are very resource-intensive [6], and that only a single abstract value of corrective feedback is provided by the discriminator to the generator. Recent research has proposed logicGAN [6], which advances the state of the art in corrective feedback by modifying the gradient descent using a dedicated xAI system.
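To make the contrast concrete, the following minimal sketch shows the difference between the single scalar score a standard GAN discriminator returns to the generator and the per-feature feedback an xAI attribution can provide. The linear discriminator and the gradient-times-input attribution are illustrative stand-ins chosen for this sketch; they are not the specific xAI system used by logicGAN.

```python
import numpy as np

# Hypothetical linear discriminator D(x) = sigmoid(w . x),
# standing in for a trained GAN discriminator.
w = np.array([2.0, -1.0, 0.5])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def discriminator(x):
    return sigmoid(w @ x)

# Standard GAN: the generator only receives the scalar score D(x).
# logicGAN-style idea (sketched here): an xAI attribution explains
# WHICH input features drove the discriminator's decision, giving
# the generator richer, per-feature corrective feedback. A simple
# gradient-times-input attribution is used as an illustrative proxy.
def attribution(x):
    p = discriminator(x)
    grad = p * (1.0 - p) * w    # dD(x)/dx for the logistic model
    return grad * x             # gradient-times-input attribution

x_fake = np.array([1.0, 0.5, -0.2])
print(discriminator(x_fake))    # one abstract scalar of feedback
print(attribution(x_fake))      # one feedback value per input feature
```

The attribution vector tells the generator which components of its sample pushed the discriminator's decision and in which direction, rather than only whether the sample was rejected.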
This system aims to explain the motivation of the classification achieved by the discriminator and thereby, thanks to a richer interpretation, supports the generator in tricking the discriminator more effectively.

Acknowledging (1) the need to enhance the training of adversarial attack DNN classifiers with complementary data, and (2) the difficulty of training a GAN for this purpose, in this paper we present an innovative algorithmic approach to improve adversarial attack classifiers based on data augmented by minor modifications generated by a logicGAN. The paper thus addresses a particular type of mitigation against adversarial attacks, which consists of training the "attacked" classifier with initial and adversarial data already known by the defender. Accordingly, we propose an algorithm that improves the training of the classifier (1) by generating complementary adversarial data which, instead of coming from the known adversarial attack, comes directly from minor modifications of the adversarial data already known, and (2) by generating these minor modifications using a specific kind of generative adversarial network named logicGAN [6]. By using an xAI system, this GAN derivative has the particularity of being able to yield more substantial corrective feedback from the discriminator to the generator and, thereby, enabling a faster mitigation of adversarial attacks.

This paper is structured as follows: in Section II, we present

¹ Nicholas Frosst: Google Brain research engineer working on the adversarial examples problem with Turing Award winner Geoffrey Hinton's Toronto team.
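The core augmentation idea described above, deriving complementary training data from minor modifications of adversarial samples already known to the defender, can be sketched as follows. Small bounded random perturbations stand in for the logicGAN generator that produces the "minor modifications" in the proposed algorithm, and the sample values, `n_copies`, and `eps` are illustrative parameters, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Adversarial samples already known to the defender (rows),
# with their true labels.
X_adv = np.array([[0.15, 0.25],
                  [0.40, 0.55]])
y_adv = np.array([1, 1])

# Augmentation step: produce several minor modifications of each
# known adversarial sample. Bounded uniform noise is a stand-in
# for the logicGAN generator used in the paper's approach.
def augment(X, y, n_copies=3, eps=0.05):
    Xs, ys = [X], [y]
    for _ in range(n_copies):
        Xs.append(X + rng.uniform(-eps, eps, size=X.shape))
        ys.append(y)                  # modifications keep the labels
    return np.vstack(Xs), np.concatenate(ys)

X_aug, y_aug = augment(X_adv, y_adv)
print(X_aug.shape)  # (8, 2): the 2 known samples plus 3 perturbed copies each
```

The augmented set `(X_aug, y_aug)` would then be added to the original training data when retraining the attacked classifier, so that the classifier also learns the neighborhood of each known adversarial sample rather than only the exact points the attacker has already revealed.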