Profiling Haters on Twitter using Statistical and
Contextualized Embeddings
Notebook for PAN at CLEF 2021
Hamed Babaei Giglou¹, Taher Rahgooy², Jafar Razmara¹, Mostafa Rahgouy³ and Zahra Rahgooy⁴
¹ Department of Computer Science, University of Tabriz, Tabriz, Iran
² Department of Computer Science, University of West Florida, Florida, USA
³ Part AI Research Center, Tehran, Iran
⁴ Department of Computer Science, Damghan University, Semnan, Iran
Abstract
Hate Speech (HS) in social media such as Twitter is a complex phenomenon that has attracted a significant
body of research in NLP. HS spreaders (haters) aim to spread HS via social media. In this task, we
aim to identify such haters. On the one hand, our proposed class-dependent LDSE representation is fed to a
linear SVM classifier to identify haters based on general commonalities. On the other hand, stylistic
features of individuals are captured by combining extractive summarization of the tweets with
RoBERTa embeddings, which are then classified by a second linear SVM classifier. Experimental results,
with accuracies of 0.67 and 0.80 on the English and Spanish test sets respectively, show the efficacy of
our approach in identifying haters across different languages.
Keywords
Hate Speech, Abusive Language, Author Profiling, Stylistic Features, Word Embedding
1. Introduction
Social media platforms enable millions of people to publicly share user-generated content. Regardless
of the type of content, a critical feature of platforms such as Twitter, Facebook, YouTube,
and Instagram is that users can discuss that content. Unfortunately, any user engaging online
always faces the risk of being targeted or harassed via abusive language or hatred expressed in
the form of racism or sexism, with a possible impact on the user and on the community in general.
The challenge of creating effective policies to identify and appropriately respond to harassment
is compounded by the difficulty of studying the phenomenon at scale. Hate speech is commonly
defined as any communication that disparages a person or a group on the basis of some
characteristic such as race, color, ethnicity, gender, sexual orientation, nationality, religion, or
other characteristics.
CLEF 2021 – Conference and Labs of the Evaluation Forum, September 21–24, 2021, Bucharest, Romania
h.babaei98@ms.tabrizu.ac.ir (H. Babaei Giglou); trahgooy@students.uwf.edu (T. Rahgooy);
razmara@tabrizu.ac.ir (J. Razmara); mostafa.rahgouy@partdp.ai (M. Rahgouy); za.rahgooy@gmail.com
(Z. Rahgooy)
https://hamedbabaei.github.io/ (H. Babaei Giglou); http://www.rahgooy.com/ (T. Rahgooy)
0000-0002-2889-0522 (T. Rahgooy)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073, http://ceur-ws.org