Available Online at www.ijscia.com | Volume 3 | Issue 5 | Sep-Oct 2022

A Multimodal Hate Speech Classification Process Using Dual Feature Extraction Techniques

Chibuike Onuoha 1 , Ikerionwu Charles 2 and Obi Nwokonkwo 1

1 Department of Information Technology, Federal University of Technology, Owerri, Nigeria
2 Department of Software Engineering, Federal University of Technology, Owerri, Nigeria

E-mail: onuoha.chibuike@futo.edu.ng; charles.ikerionwu@futo.edu.ng; obi.nwokonkwo@futo.edu.ng

*Corresponding author details: Chibuike Onuoha; onuoha.chibuike@futo.edu.ng

ABSTRACT

Racist and ethnic violence, fabricated persecution, and other forms of intimidation are all risks associated with hate speech, which makes its detection a concern for natural language processing. Given the sensitivity of hate speech in society, it is essential to classify speech into hate and non-hate categories in real time to minimize these risks. The main objective of this work is to investigate selected supervised machine learning algorithms for classifying hate speech on social media. The model extracts features using term frequency-inverse document frequency (TF-IDF) and bag of words (BOW). Porter's stemmer and WordNet lemmatization are used during preprocessing. The datasets were trained using logistic regression, naive Bayes, and random forest, and logistic regression was also used to build the classifier. 80% of the data was used to train the model and 20% to test it. Logistic regression achieved 98% accuracy and a 98% F1-score. These scores indicate high accuracy in hate speech detection and classification.

Keywords: NLP; hate speech; classification; accuracy

INTRODUCTION

Although social media is a key component that has aided social engineering in the 21st century, it is not without disadvantages.
Recently, nations on the receiving end have moved to police social media in order to detect hate speech and offensive comments more easily. For example, hate speech has been propagated through social media and used to incite the populace against established authority. According to [1], hate speech is offensive language (aggressive, insulting, provocative, etc.) targeted at a person or group of people. To raise awareness and nip the propagation of hate speech in the bud, social media platforms have asked their users to shun such acts. Because of the varieties of hate speech observed in different societies, [2] opined that there is no generally accepted concept of hate speech. Hate speech is a controversial concept: what is considered hate speech in one environment might not be seen as such in another.

Hate is an indication of an emotional state or opinion, and is therefore distinct from any manifested action. Speech is any expression imparting opinions or ideas, bringing a subjective opinion or idea to an external audience [3]. It can take many forms (written, non-verbal, visual, or artistic) and can be disseminated through any medium, including the internet, print, radio, or television. Based on these two components of the term 'hate speech', hate speech is an expression of hate towards a person or group of people. Following this definition, hate speech in the context of this work is an emotional opinion or idea directed at an individual or group.

FIGURE 1: Text classification pipeline process

Such speech can originate in different forms, namely written, artistic, or visual, and can be distributed through different channels such as television and the internet. Because hate speech is a recent phenomenon and has surged in use, its detection has attracted recent research interest [4].
Manual detection of hateful text is tedious and prone to bias, which is why researchers are seeking automated ways to detect hate speech on the web. Because hate speech detection is a natural language processing classification problem, it is complicated by the grammar and sentence structure used in online media communities.

International Journal of Scientific Advances ISSN: 2708-7972 Volume: 3 | Issue: 5 | Sep - Oct 2022 Available Online: www.ijscia.com DOI: 10.51542/ijscia.v3i5.16
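The two feature extraction techniques named in the abstract (BOW and TF-IDF) and the 80/20 train/test split can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the whitespace tokenizer merely stands in for the Porter stemming and WordNet lemmatization step, the TF-IDF variant (term count over document length, multiplied by ln(N/df)) is assumed because the paper does not state which weighting it uses, and all function names and sample sentences are hypothetical.

```python
import math
import random
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (stand-in for the paper's preprocessing)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return the sorted vocabulary and one raw term-count vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(count_vectors):
    """Reweight count vectors as tf * idf (assumed variant).

    tf  = term count / document length
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(count_vectors)
    n_terms = len(count_vectors[0])
    doc_freq = [sum(1 for vec in count_vectors if vec[j] > 0) for j in range(n_terms)]
    weighted = []
    for vec in count_vectors:
        length = sum(vec) or 1  # guard against empty documents
        weighted.append(
            [(c / length) * math.log(n_docs / doc_freq[j]) for j, c in enumerate(vec)]
        )
    return weighted


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and hold out test_fraction of them for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy corpus, for illustration only.
docs = [
    "i hate you",
    "have a nice day",
    "you are nice",
    "i hate this day",
    "what a nice day",
]

vocab, counts = bag_of_words(docs)      # BOW features
weights = tf_idf(counts)                # TF-IDF features
train_docs, test_docs = train_test_split(docs, test_fraction=0.2)
print(len(train_docs), len(test_docs))  # 4 1
```

Either feature matrix (`counts` or `weights`) could then be passed to a classifier such as logistic regression; in practice a library implementation of both the vectorizers and the classifier would normally replace these hand-written helpers.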