(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 13, No. 12, 2022
830 | Page www.ijacsa.thesai.org

Utilizing Deep Learning in Arabic Text Classification Sentiment Analysis of Twitter

Nehad M. Ibrahim¹, Wael M.S. Yafooz², Abdel-Hamid M. Emara³, Ahmed Abdel-Wahab⁴

¹ Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 31441, Saudi Arabia
² Department of Computer Science, College of Computer Science and Engineering, Taibah University, Medina 42353, Saudi Arabia
³ Department of Computers and Systems Engineering, Faculty of Engineering, Al-Azhar University, Cairo 11884, Egypt
⁴ Arab Open University, Riyadh, Saudi Arabia

Abstract—The number of social media users has increased. These users share and reshare their ideas in posts, and this information can be mined by decision-makers in different domains, who analyse user opinions on social media networks to improve the quality of products or to study specific phenomena. During the COVID-19 pandemic, sentiment analysis of social media was used to inform decisions aimed at limiting the spread of the disease. Substantial research on this topic has been done; however, Arabic textual resources on social media are limited, which has resulted in fewer high-quality sentiment analyses of Arabic texts. This study proposes a model for Arabic sentiment analysis using a Twitter dataset and deep learning models with Arabic word embedding. It applies supervised deep learning algorithms to the proposed dataset. The dataset contains 51,000 tweets, of which 8,820 are classified as positive, 37,360 as neutral, and 8,820 as negative. After cleaning, it contains 31,413 tweets.
The experiments apply the deep learning models Convolutional Neural Network and Long Short-Term Memory and compare their results with those of machine learning techniques such as Naive Bayes and Support Vector Machine. The AraBERT model achieves an accuracy of 0.92 when tested on 3,505 tweets.

Keywords—Arabic sentiment analysis; machine learning; convolutional neural networks; word embedding; Arabic Word2Vec; long short-term memory; AraBERT

I. INTRODUCTION

Recently, sentiment analysis has been prioritized by researchers because it plays an important role in many domains. It is primarily used to study user feedback (user opinion) on a specific event, product or social phenomenon. Many studies have proposed models, approaches or novel databases to predict and detect user opinions. These methods use machine learning classifiers, deep learning models and natural language processing techniques as pre-processing methods.

Most sentiment analysis research focuses on languages other than Arabic. Recent Natural Language Processing research increasingly focuses on deep neural learning [1]. Work on the Arabic language includes research initiatives launched through a competition funded by the King Abdullah University of Science and Technology (KAUST), alongside some individual research efforts. In other languages, particularly English, sentiment analysis has proven significant owing to the vast amount of data contributed by users on social networks (Facebook, Twitter, etc.).

In machine learning, classification, a form of supervised learning, is used in sentiment analysis. The methods used in sentiment analysis can be categorized into binary classification, multi-class classification, polarity, multilingual and aspect-based sentiment analysis. In binary classification, the classes are only positive and negative. In multi-class classification, there are more than two classes.
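The difference between the binary and multi-class settings can be illustrated with a minimal sketch. The label names, the helper functions `encode_labels` and `to_binary`, and the sample list are hypothetical and chosen only for illustration; they are not taken from the code used in this study.

```python
# Hypothetical label sets (illustrative names, not from the study's code).
BINARY_LABELS = ("negative", "positive")
MULTICLASS_LABELS = ("negative", "neutral", "positive")


def encode_labels(labels, classes):
    """Map each string label to the integer index of its class."""
    index = {name: i for i, name in enumerate(classes)}
    return [index[label] for label in labels]


def to_binary(labels):
    """Drop neutral examples so that only the two polar classes remain."""
    return [label for label in labels if label != "neutral"]


# A toy list of tweet labels (assumed data for the example).
sample = ["positive", "neutral", "negative", "neutral"]

# Multi-class setting: three target classes.
multi_ids = encode_labels(sample, MULTICLASS_LABELS)

# Binary setting: neutral tweets removed, two target classes.
binary_ids = encode_labels(to_binary(sample), BINARY_LABELS)

print(multi_ids, binary_ids)
```

Discarding neutral tweets is only one possible way to obtain a binary task; merging neutral into one of the polar classes would be another, and the choice changes the class balance.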
Additionally, classifiers such as DT and TH are used in binary classification, while KNN and LR are used in multi-class classification. Polarity in sentiment analysis is based on a dictionary that assigns a score to each word. Multilingual sentiment analysis requires many pre-processing steps for opinion detection. Aspect-based sentiment analysis focuses on one aspect, concept or word.

To the best of our knowledge, less attention has been given to Arabic sentiment analysis, and there are fewer public Arabic datasets [2]. Therefore, this paper proposes a model for Arabic sentiment analysis based on the proposed dataset. This work uses supervised deep learning algorithms. Before cleaning, the original dataset contains 51,000 tweets classified as 8,820 positive, 37,360 neutral and 8,820 negative. After cleaning, it contains 31,413 tweets classified as 4,855 positive, 21,842 neutral and 4,716 negative.

This work introduces and applies deep learning methods to multi-class Arabic text sentiment analysis with parameter optimization, and improves the text pre-processing stage. We apply the deep learning methods Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) and compare their results with those of supervised machine learning techniques such as Naive Bayes (NB) and Support Vector Machine (SVM). The best CNN model reaches an accuracy of 95.8% and the LSTM reaches 96.6%, both better than the SVM and NB results of 82.5% and 69.4%, respectively. We also used BERT pre-trained specifically on Arabic, aiming to achieve the same success that BERT achieved in English [3]. Based on a review of the literature and the high accuracy achieved by the deep learning models, the main contributions of this paper can be summarized as follows.