A Deep Transfer Learning Approach for Fake News Detection

Tanik Saikh*, Haripriya B†, Asif Ekbal* and Pushpak Bhattacharyya*
*Department of Computer Science and Engineering, Indian Institute of Technology Patna, Bihta, Patna, India
†Department of Computer Science and Engineering, Indian Institute of Information Technology Senapati Manipur, India
Email: *{tanik.srf17,asif,pb}@iitp.ac.in, †haripriya@iiitmanipur.ac.in

Abstract—The detection of fake, incorrect, or misleading information has recently attracted the attention of researchers and developers because of the huge volume of information available on the web. This problem can be considered equivalent to lie detection, truthfulness identification, or stance detection. In this work, we focus on deciding whether the title of a news article is consistent with its body text, a problem equivalent to fake information identification. We propose a deep transfer learning approach in which the problem of detecting title-body consistency is posed from the viewpoint of Textual Entailment (TE): the title is considered the hypothesis and the news body is treated as the premise. The idea is to decide whether the title can be inferred from the body or not. Evaluation on an existing benchmark dataset, namely the Fake News Challenge (FNC) dataset (released in Fake News Challenge Stage 1 (FNC-I): Stance Detection), shows the efficacy of our proposed approach in comparison to state-of-the-art systems.

Index Terms—Text Entailment, Title-Body Consistency, Stance Detection, Fake News, Deep Transfer Learning

I. INTRODUCTION

Online platforms such as social media websites, e-commerce sites for products and services, blogs, and online discussion forums are now closely tied to our day-to-day lives. A large volume of textual content is generated daily from these sources.
This information can be effectively utilized to build models for applications related to Artificial Intelligence (AI), Natural Language Processing (NLP) and Machine Learning (ML). As the sources are diverse in nature and large in number, studying the credibility and authenticity of information is a crucial step. Multiple reports are often available for a particular event, and vice versa. Different news agencies also produce different reports on a particular topic. A legitimate report should be consistent with its title. To judge the truthfulness of a particular event or claim, it is necessary to observe what other agencies are saying on that topic. In other words, to establish the truthfulness of a fact or claim, one must judge the consistency between that claim and the body texts related to the same topic. This could be a vital module of a robust fake news detection system, whose task is to identify whether a news item is genuine or fake.

In this paper, we tackle the problem of fake news detection through stance detection. Stance is basically an article's response to a title, headline, or claim. The response can be any of the following: Agree, Disagree, Discuss and Unrelated. Stance detection is one of the fundamental approaches to fake news detection, and we make use of it to combat fake news. Fake information or claims can be detected through stance as follows: suppose a person makes a claim such as "Barack Obama was not born in the United States" or "The pope has a new baby"; we can take that claim and search for news articles on that subject. If many reputable (and well-known) sources all Agree with the claim, then we can say that the claim is most probably true. More concisely, the task can be defined as follows: given a claim and a body of text (such as a news article), the system has to decide whether the body text Agrees with, Disagrees with, Discusses (is neutral toward), or is completely Unrelated to the claim.
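The claim-checking idea described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Stance` enum mirrors the four labels named in the text, while `StancedArticle`, `support_ratio`, the `reputable` flag, and the placeholder source names are hypothetical names introduced here, not part of the Fake News Challenge tooling or of the proposed system. It also assumes the stance of each article has already been predicted by some stance detector.

```python
from dataclasses import dataclass
from enum import Enum


class Stance(Enum):
    """The four stance labels named in the text."""
    AGREE = "agree"
    DISAGREE = "disagree"
    DISCUSS = "discuss"
    UNRELATED = "unrelated"


@dataclass(frozen=True)
class StancedArticle:
    """Hypothetical record: one article and its (already predicted) stance."""
    source: str
    stance: Stance
    reputable: bool  # assumed to come from an external credibility list


def support_ratio(articles):
    """Fraction of reputable articles taking a side that Agree with the claim.

    Discuss and Unrelated articles are ignored because they take no side.
    Returns None when no reputable article agrees or disagrees.
    """
    sided = [a for a in articles
             if a.reputable and a.stance in (Stance.AGREE, Stance.DISAGREE)]
    if not sided:
        return None
    agree = sum(1 for a in sided if a.stance is Stance.AGREE)
    return agree / len(sided)


# Placeholder data for a single claim; sources and stances are made up.
articles = [
    StancedArticle("source-a", Stance.DISAGREE, reputable=True),
    StancedArticle("source-b", Stance.DISAGREE, reputable=True),
    StancedArticle("source-c", Stance.AGREE, reputable=False),
    StancedArticle("source-d", Stance.DISCUSS, reputable=True),
    StancedArticle("source-e", Stance.UNRELATED, reputable=True),
]
print(support_ratio(articles))
```

A ratio close to 1 would suggest the claim is probably true and a ratio close to 0 that it is probably false; how to set a decision threshold is left open here, since the text does not specify one.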
Deciding the relation between a claim and a body of text in this way is typically called the stance detection problem. Hence, detecting the truthfulness of a particular piece of information or claim, and the consistency of that information with its context, is a challenging task in fake news detection. According to [1], fake news is "made-up stories with an intention to deceive". Basically, the task of fake news detection is to estimate the probability of a piece of text being fake. The problem of fake information detection has been viewed from different perspectives, viz. (i) determining whether the textual content of a news article is true or not, and (ii) evaluating the intrinsic prejudice of a written text.

In our current work, we use a setup that consists of the News Title (NT), the News Body (NB) and their stance (relation). We make use of the benchmark dataset released as part of the Fake News Challenge [2]¹. We pose the task of stance detection as equivalent to consistency detection between the NT and the NB, which is conceptually very similar to a popular NLP task, namely Natural Language Inference (NLI) [3] or Textual Entailment (TE) [4], [5]. TE is defined as follows: given two pieces of text, one being the Premise (P) and the other the Hypothesis (H), the system has to decide whether H is a logical consequence of P or not. P entails H if H is true in every circumstance (possible world) in which P is true.

¹ http://www.fakenewschallenge.org/

978-1-7281-6926-2/20/$31.00 ©2020 IEEE
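The possible-worlds definition of entailment, and the title-as-hypothesis framing, can be made concrete with a toy sketch. The code below is an assumption-laden illustration rather than the proposed system: it checks entailment over propositional truth assignments, whereas real TE systems work on natural language and learn the relation from data. The names `TEPair`, `to_te_pair`, `entails`, and the `rain`/`wet` atoms are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, Sequence

World = Dict[str, bool]                  # one truth assignment ("possible world")
Proposition = Callable[[World], bool]    # a statement evaluated in a world


@dataclass(frozen=True)
class TEPair:
    """A title-body pair viewed as a Textual Entailment instance."""
    premise: str     # the News Body (NB)
    hypothesis: str  # the News Title (NT)


def to_te_pair(title: str, body: str) -> TEPair:
    """Frame a news item as (premise = body, hypothesis = title)."""
    return TEPair(premise=body, hypothesis=title)


def entails(p: Proposition, h: Proposition, atoms: Sequence[str]) -> bool:
    """Return True if h holds in every world in which p holds."""
    worlds = (dict(zip(atoms, values))
              for values in product([False, True], repeat=len(atoms)))
    return all(h(w) for w in worlds if p(w))


# Toy propositions over two atoms.
atoms = ["rain", "wet"]
rain_and_wet = lambda w: w["rain"] and w["wet"]
wet = lambda w: w["wet"]

print(entails(rain_and_wet, wet, atoms))  # True: "rain and wet" entails "wet"
print(entails(wet, rain_and_wet, atoms))  # False: "wet" does not entail "rain and wet"

pair = to_te_pair(title="Example headline", body="Example article text.")
print(pair.hypothesis, "|", pair.premise)
```

The asymmetry in the two `entails` calls is the point of the framing: the body (premise) is expected to support the title (hypothesis), not the other way around.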