A Deep Transfer Learning Approach for Fake News Detection
Tanik Saikh*, Haripriya B†, Asif Ekbal* and Pushpak Bhattacharyya*
*Department of Computer Science and Engineering, Indian Institute of Technology Patna, Bihta, Patna, India
†Department of Computer Science and Engineering, Indian Institute of Information Technology Senapati, Manipur, India
Email: *{tanik.srf17,asif,pb}@iitp.ac.in, †haripriya@iiitmanipur.ac.in
Abstract—Detecting fake or incorrect information (misinformation)
has recently attracted the attention of researchers and developers
because of the huge volume of information on the web. This
problem can be considered equivalent to lie detection,
truthfulness identification, or stance detection. In this work,
we focus on deciding whether the title of a news article is
consistent with its body text, a problem equivalent to fake
information identification. We propose a deep transfer learning
approach in which the problem of detecting title-body consistency
is posed from the viewpoint of Textual Entailment (TE): the title
is treated as the hypothesis and the news body as the premise.
The idea is to decide whether the body entails the title.
Evaluation on an existing benchmark, the Fake News Challenge (FNC)
dataset (released in Fake News Challenge Stage 1 (FNC-I): Stance
Detection), shows the efficacy of our proposed approach in
comparison to state-of-the-art systems.
Index Terms—Textual Entailment, Title-Body Consistency, Stance
Detection, Fake News, Deep Transfer Learning
I. INTRODUCTION
Online platforms such as social media websites, e-commerce
sites for products and services, blogs, and online discussion
forums are now closely tied to our day-to-day lives. A large
volume of textual content is generated daily from these sources.
This information can be effectively utilized to build models for
applications in Artificial Intelligence (AI), Natural Language
Processing (NLP) and Machine Learning (ML). As the sources are
diverse in nature and large in number, assessing the credibility
and authenticity of the information is a crucial step. There are
often multiple reports available for a particular event, and vice
versa. Different news agencies also produce different reports on
a particular topic. A legitimate report should be consistent with
its title. To judge the truthfulness of a particular event or
claim, it is necessary to observe what other agencies are saying
on that topic. In other words, one must judge the consistency
between the fact or claim and the body texts related to that
topic. This could be a vital module of a robust fake news
detection system, whose task is to identify whether a news item
is genuine or fake.
In this paper, we tackle the problem of fake news detection
through stance detection. Stance is basically an article's
response to a title/headline/claim. The response can be any of
the following: Agree, Disagree, Discuss and Unrelated. Stance
detection is one of the fundamental approaches to fake news
detection, and we use it to combat fake news. Fake information or
claims can be detected through stance as follows: suppose a
person makes a claim such as “Barack Obama was not born in the
United States” or “The pope has a new baby”. We can take that
claim and search for news articles on that subject. If many
reputable (and well-known) sources all Agree with the claim, then
we can say that the claim is most probably true.
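The aggregation idea in the preceding paragraph can be sketched as follows. This is only an illustration, not the method evaluated in this paper: the function name `support_ratio`, the lowercase label strings, and the choice to ignore Discuss and Unrelated articles are all assumptions made for the example.

```python
from collections import Counter

# Stance labels named in the text, written in lowercase (an assumption).
STANCES = ("agree", "disagree", "discuss", "unrelated")

def support_ratio(stances):
    """Fraction of side-taking sources (agree or disagree) that agree.

    Returns None when no source takes a side, because the claim
    cannot be judged from discuss/unrelated articles alone.
    """
    counts = Counter(stances)
    unknown = set(counts) - set(STANCES)
    if unknown:
        raise ValueError(f"unknown stance labels: {sorted(unknown)}")
    committed = counts["agree"] + counts["disagree"]
    if committed == 0:
        return None
    return counts["agree"] / committed

# Hypothetical stances of five reputable articles toward one claim.
example = ["agree", "agree", "discuss", "unrelated", "disagree"]
ratio = support_ratio(example)
```

A ratio close to 1 corresponds to the case described above, where reputable sources consistently Agree with the claim. Any decision threshold on this ratio would be a further design choice.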
More concisely, the task can be defined as follows: given a claim
and a body of text (such as a news article), the system has to
decide whether the body generally Agrees with, Disagrees with,
Discusses (is neutral toward), or is completely Unrelated to the
claim. This problem is typically called stance detection. Hence,
detecting the truthfulness of a particular claim, and the
consistency of that claim with its context, are challenging
tasks for fake news detection. According to [1], fake news is
“made-up stories with an intention to deceive”. Basically, the
task of fake news detection is to estimate the probability of a
piece of text being fake. The problem of fake information
detection has been viewed from different perspectives, viz.
(i) determining whether the textual content of a news article is
true or not, and (ii) evaluating the intrinsic prejudice of a
written text. In our current work, we use a setup that consists
of the News Title (NT), the News Body (NB) and their stance
(relation). We make use of the benchmark dataset released as
part of the Fake News Challenge [2]¹.
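As a minimal sketch of this setup, one instance can be represented as a title, a body, and one of the four stance labels. The class and function names below are hypothetical and do not reflect the released dataset's file format. The helper `as_entailment_pair` follows the framing stated in the abstract, where the body is the premise and the title is the hypothesis.

```python
from dataclasses import dataclass
from enum import Enum

class Stance(Enum):
    """The four stance labels named in the text."""
    AGREE = "agree"
    DISAGREE = "disagree"
    DISCUSS = "discuss"
    UNRELATED = "unrelated"

@dataclass(frozen=True)
class NewsInstance:
    title: str      # News Title (NT)
    body: str       # News Body (NB)
    stance: Stance  # relation between NT and NB

def parse_stance(label: str) -> Stance:
    """Map a raw label string to a Stance, ignoring case and padding."""
    return Stance(label.strip().lower())

def as_entailment_pair(instance: NewsInstance) -> tuple[str, str]:
    """Return (premise, hypothesis): the body is the premise, the title the hypothesis."""
    return instance.body, instance.title

# Hypothetical instance, written only for illustration.
sample = NewsInstance(
    title="The pope has a new baby",
    body="The Vatican press office has made no such announcement.",
    stance=parse_stance("Disagree"),
)
premise, hypothesis = as_entailment_pair(sample)
```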
We pose the task of stance detection as equivalent to
consistency detection between the NT and the NB, which
is conceptually very similar to a popular task in NLP,
namely Natural Language Inference (NLI) [3] or Textual
Entailment (TE) [4], [5]. The definition is as follows: given
two pieces of text, one being the Premise (P) and the other
the Hypothesis (H), the system has to decide whether
• H is the logical consequence of P or not.
• H is true in every circumstance (possible world) in which
P is true.
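The "possible world" reading of entailment can be made concrete in a toy propositional setting, where a world is simply a truth assignment to a few variables. The sketch below is an illustration under that assumption only. Systems for natural-language TE do not enumerate worlds, and the variable names and example propositions are invented here.

```python
from itertools import product

def entails(premise, hypothesis, variables):
    """Return True if the hypothesis holds in every world where the premise holds."""
    for values in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, values))
        # A world where P is true but H is false is a counterexample.
        if premise(world) and not hypothesis(world):
            return False
    return True

# Toy propositions: "it rains" and "the street is wet".
variables = ["rain", "wet"]

def premise(world):
    # P: it rains, and rain implies a wet street.
    return world["rain"] and (not world["rain"] or world["wet"])

def hypothesis(world):
    # H: the street is wet.
    return world["wet"]

p_entails_h = entails(premise, hypothesis, variables)
h_entails_p = entails(hypothesis, premise, variables)
```

Here P entails H, but H does not entail P, since the street can be wet in a world without rain. This asymmetry is why the roles of premise and hypothesis must be fixed before the title-body pair is classified.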
¹ http://www.fakenewschallenge.org/
978-1-7281-6926-2/20/$31.00 ©2020 IEEE