NLytics at CheckThat! 2021: Detecting Previously
Fact-Checked Claims by Measuring Semantic
Similarity
Albert Pritzkau¹
¹ Fraunhofer Institute for Communication, Information Processing and Ergonomics FKIE, Fraunhoferstraße 20, 53343 Wachtberg, Germany
Abstract
The following system description presents our approach to the detection of previously fact-checked
claims. Given a claim originating from a tweet or a political debate, we determine its similarity to a
collection of previously fact-checked claims. In line with the origin of the claims, the collection of
previously fact-checked claims is composed of tweets and political debates respectively. The given task
has been framed as a sequence similarity problem. Relevance scoring is based on semantic similarity.
Similarity is calculated by distance metrics on representation vectors at paragraph level.
Keywords
Information Retrieval, Semantic Similarity, Deep Learning, Transformers, RoBERTa
1. Introduction
Social networks give organizations as well as individual actors opportunities to conduct
disinformation campaigns. The proliferation of disinformation online has given rise to a lot
of research on automatic fake news detection. CLEF 2021 - CheckThat! Lab [1, 2] considers
disinformation as a communication phenomenon. By detecting the use of various claims in
(political) communication, it takes into account not only the content but also how a subject
matter is communicated by specific actors, in particular, by repetition of the same claims.
Task definition: Detect Previously Fact-Checked Claims. Given a check-worthy claim
and a set of previously fact-checked claims, determine whether the claim has been previously
fact-checked. Based on the source of the considered claims, the shared task [3] defines the
following subtasks, both of which are framed as ranking tasks:
• Subtask A: Detect Previously Fact-Checked Claims in Tweets
• Subtask B: Detect Previously Fact-Checked Claims in Political Debates/Speeches
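To make the ranking formulation concrete, the following minimal sketch orders a set of previously fact-checked claims by their similarity to an input claim. It is an illustration under stated assumptions, not the system described in this paper: the `embed` function is a hypothetical bag-of-words stand-in for a learned paragraph-level representation, cosine similarity is assumed as the metric, and all function names are chosen here for illustration only.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Hypothetical stand-in for a paragraph encoder.

    Assumption: a real system would return a dense vector from a
    transformer model; here we use simple lowercase token counts.
    """
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_verified_claims(input_claim: str, verified_claims: list[str]) -> list[tuple[str, float]]:
    """Return (claim, score) pairs sorted from most to least similar."""
    query = embed(input_claim)
    scored = [(claim, cosine_similarity(query, embed(claim))) for claim in verified_claims]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_verified_claims(input_claim, verified_claims)` yields the ranked list that a ranking-based evaluation would consume. Swapping `embed` for a sentence encoder leaves the ranking logic unchanged.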
CLEF 2021 – Conference and Labs of the Evaluation Forum, September 21–24, 2021, Bucharest, Romania
albert.pritzkau@fkie.fraunhofer.de (A. Pritzkau)
https://www.fkie.fraunhofer.de/ (A. Pritzkau)
0000-0001-7985-0822 (A. Pritzkau)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (http://ceur-ws.org), ISSN 1613-0073