NLytics at CheckThat! 2021: Detecting Previously Fact-Checked Claims by Measuring Semantic Similarity

Albert Pritzkau 1

1 Fraunhofer Institute for Communication, Information Processing and Ergonomics FKIE, Fraunhoferstraße 20, 53343 Wachtberg, Germany

Abstract

The following system description presents our approach to the detection of previously fact-checked claims. Given a claim originating from a tweet or a political debate, we determine its similarity to a collection of previously fact-checked claims. In line with the origin of the claims, the collection of previously fact-checked claims is composed of tweets and political debates, respectively. The task has been framed as a sequence similarity problem. Relevance scoring is based on semantic similarity, which is calculated by applying distance metrics to representation vectors at paragraph level.

Keywords

Information Retrieval, Semantic Similarity, Deep Learning, Transformers, RoBERTa

1. Introduction

Social networks give organizations as well as individual actors opportunities to conduct disinformation campaigns. The proliferation of disinformation online has given rise to a large body of research on automatic fake news detection. The CLEF 2021 CheckThat! Lab [1, 2] considers disinformation as a communication phenomenon. By detecting the use of various claims in (political) communication, it takes into account not only the content but also how a subject matter is communicated by specific actors, in particular through repetition of the same claims.

Task definition: Detect Previously Fact-Checked Claims

Given a check-worthy claim and a set of previously fact-checked claims, determine whether the claim has been previously fact-checked.
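
The similarity-based view of this task can be illustrated with a minimal sketch. It is not the system described in this paper: the `embed` function below is a hypothetical stand-in that builds a simple bag-of-words vector, whereas the approach described here relies on paragraph-level representation vectors from a transformer model. The function names and the example claims are assumptions made for illustration only. The sketch shows only the general pattern of scoring each previously fact-checked claim against an input claim with cosine similarity and sorting by that score.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for a sentence encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[term] * v[term] for term in u if term in v)
    norm_u = math.sqrt(sum(count * count for count in u.values()))
    norm_v = math.sqrt(sum(count * count for count in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_claims(input_claim, verified_claims):
    """Return (score, claim) pairs sorted from most to least similar."""
    query = embed(input_claim)
    scored = [(cosine_similarity(query, embed(claim)), claim)
              for claim in verified_claims]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Illustrative (made-up) collection of previously fact-checked claims.
verified = [
    "the unemployment rate fell to a record low",
    "vaccines cause autism",
    "the earth is flat",
]
ranking = rank_claims("unemployment rate is at a record low", verified)
for score, claim in ranking:
    print(f"{score:.3f}  {claim}")
```

Replacing `embed` with a dense sentence encoder would leave the ranking logic unchanged, since only the vector representation differs.
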
Based on the source of the considered claims, the shared task [3] defines the following subtasks, both of which are framed as ranking tasks:

• Subtask A: Detect Previously Fact-Checked Claims in Tweets
• Subtask B: Detect Previously Fact-Checked Claims in Political Debates/Speeches

CLEF 2021 – Conference and Labs of the Evaluation Forum, September 21–24, 2021, Bucharest, Romania
albert.pritzkau@fkie.fraunhofer.de (A. Pritzkau)
https://www.fkie.fraunhofer.de/ (A. Pritzkau)
ORCID: 0000-0001-7985-0822 (A. Pritzkau)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073