Received May 22, 2021, accepted May 26, 2021, date of publication June 3, 2021, date of current version June 14, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3085875

All Your Fake Detector Are Belong to Us: Evaluating Adversarial Robustness of Fake-News Detectors Under Black-Box Settings

HASSAN ALI 1, MUHAMMAD SULEMAN KHAN 2, AMER ALGHADHBAN 3 (Member, IEEE), MESHARI ALAZMI 4, AHMAD ALZAMIL 3, KHALED AL-UTAIBI 5 (Senior Member, IEEE), AND JUNAID QADIR 6 (Senior Member, IEEE)

1 IHSAN Lab, Information Technology University, Lahore 54600, Pakistan
2 Department of Computer Science, Information Technology University (ITU), Lahore 54600, Pakistan
3 Department of Electrical Engineering, College of Engineering, University of Ha'il, Ha'il 81451, Saudi Arabia
4 Department of Information and Computer Science, College of Computer Science and Engineering, University of Ha'il, Ha'il 81451, Saudi Arabia
5 Department of Computer Engineering, College of Computer Science and Engineering, University of Ha'il, Ha'il 81451, Saudi Arabia
6 Department of Electrical Engineering, Information Technology University (ITU), Lahore 54600, Pakistan

Corresponding author: Junaid Qadir (junaid.qadir@itu.edu.pk)

This work was supported by the Deputy for Research and Innovation, Ministry of Education, through the Initiative of Institutional Funding at University of Ha'il, Saudi Arabia, under Project IFP-2004.

ABSTRACT With the hyperconnectivity and ubiquity of the Internet, the fake-news problem now presents a greater threat than ever before. One promising solution for countering this threat is to leverage deep learning (DL)-based text classification methods for fake-news detection. However, since such methods have been shown to be vulnerable to adversarial attacks, the integrity and security of DL-based fake-news classifiers are under question.
Although many works study text classification under adversarial threats, to the best of our knowledge, no prior work in the literature specifically analyzes the performance of DL-based fake-news detectors under adversarial settings. We bridge this gap by evaluating the performance of fake-news detectors under various configurations in black-box settings. In particular, we investigate the robustness of four different DL architectural choices (multilayer perceptron (MLP), convolutional neural network (CNN), recurrent neural network (RNN), and a recently proposed hybrid CNN-RNN), trained on three different state-of-the-art datasets, under four adversarial attacks (TextBugger, TextFooler, PWWS, and DeepWordBug) implemented using the state-of-the-art NLP attack library, TextAttack. Additionally, we explore how changing the detector complexity, the input sequence length, and the training loss affect the robustness of the learned model. Our experiments suggest that RNNs are more robust than the other architectures. Further, we show that increasing the input sequence length generally increases the detector's robustness. Our evaluations provide key insights for robustifying fake-news detectors against adversarial attacks.

INDEX TERMS Fake news detection, deep neural networks, adversarial attacks, adversarial robustness.

I. INTRODUCTION

Recent advances in information and communication technology, including the rise of social media, artificial intelligence (AI), computational bots, and ubiquitous connectivity, have resulted in an information ecosystem that is awash with low-quality, partisan, or even outright fake news [1]. Advances such as deep learning and generative adversarial networks (GANs) have made it easy for any motivated entity to create fake news and use it for large-scale opinion manipulation [2].

The associate editor coordinating the review of this manuscript and approving it for publication was Mehedi Masud.
One such example is the 2016 US presidential election, where fake news generated for personal gain was believed and shared by 37 million Facebook users [1], [3]. The future well-being of our society is contingent on combating the fake-news malaise effectively. In recent times, the use of AI and machine learning (ML) has been proposed for developing algorithmic fake-news detectors that can flag false information. In particular, researchers are leveraging deep neural network (DNN)-based text-classification methods to meet the fake-news detection challenge [1]–[4].

VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
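The word-level black-box attacks evaluated later in this paper (TextBugger, TextFooler, PWWS, DeepWordBug) share a common recipe: query the detector as an oracle, greedily swap words for near-synonyms that most reduce the "fake" score, and stop once the prediction flips. The following is a minimal illustrative sketch of that recipe only; the toy keyword-based detector and the tiny synonym table are hypothetical stand-ins, not the paper's trained models or the TextAttack API.

```python
# Illustrative sketch of a greedy black-box word-substitution attack,
# in the spirit of TextFooler/PWWS. The detector and synonym table
# below are hypothetical toys, not the paper's models.
from typing import Callable, Dict, List

def toy_detector(text: str) -> float:
    """Hypothetical detector: returns P(fake) from sensational keyword counts."""
    fake_cues = {"shocking", "secret", "exposed"}
    hits = sum(w in fake_cues for w in text.lower().split())
    return min(1.0, hits / 3)

# Hypothetical synonym candidates for substitution.
SYNONYMS: Dict[str, List[str]] = {
    "shocking": ["surprising"],
    "secret": ["private"],
    "exposed": ["revealed"],
}

def greedy_attack(text: str, score: Callable[[str], float],
                  threshold: float = 0.5) -> str:
    """Greedily replace words with synonyms that lower P(fake),
    querying the detector only as a black box (no gradients)."""
    words = text.split()
    for i, w in enumerate(words):
        if score(" ".join(words)) < threshold:
            break  # already classified as real; stop perturbing
        for cand in SYNONYMS.get(w.lower(), []):
            trial = words[:i] + [cand] + words[i + 1:]
            if score(" ".join(trial)) < score(" ".join(words)):
                words = trial  # keep the substitution that lowers the score
                break
    return " ".join(words)

original = "shocking secret exposed by insider"
adversarial = greedy_attack(original, toy_detector)
print(toy_detector(original) >= 0.5)     # True: flagged as fake
print(toy_detector(adversarial) >= 0.5)  # False: evades the toy detector
```

Real attacks differ mainly in how substitution candidates are generated (embedding neighbors, character perturbations) and in which word-importance ranking guides the greedy search, but the query-only loop above is the shared core.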