International Journal of Computer Science Trends and Technology (IJCST) – Volume 8 Issue 3, May-Jun 2020
ISSN: 2347-8578 www.ijcstjournal.org Page 69

Applying Sentiment Analysis on Arabic comments in Sudanese Dialect

Islam Saif Aldin Mukhtar Heamida [1], EL Samani Abd Elmutalib Ahmed [2], Mamdoh Noureldin Elsayed Mohamed [3], Abd Alhameed Adam Ahmed Salih [4]
Faculty of Computer Science and information technology, University of Alneelain
Faculty of Mathematical Sciences, Khartoum University
Sudan

ABSTRACT
This paper addresses the problem of sentiment classification for Sudanese text on Facebook using machine learning methods. One feature of the Sudanese dialect is its use of a variety of word endings depending on declension, tense, and grammatical gender. Sudanese colloquial Arabic has no formal grammatical or morphological rules; for example, in slang, speakers attach a present-tense marker to the word. Another common problem in sentiment classification across languages is that a single word can carry different meanings; for example, the word "Salem" can be either a proper name or an adjective. Our task was to evaluate how the processing steps and lemmatization libraries that we used affected Sudanese colloquial text. Two classifiers, Support Vector Machine (SVM) and Naïve Bayes (NB), were applied to classify comments by polarity: positive, negative, or neutral. The work was evaluated with four different measures. The results revealed that using SVM with lemmatization libraries improves the accuracy of sentiment classification: SVM achieved the best accuracy, 68.6%, while NB achieved 63.1%.

Keywords :- Opinion Mining, Sentiment Analysis, Sudanese Colloquial Dialect.

I. INTRODUCTION
Sentiment analysis is an NLP method [1] that is applied to a text to determine whether the author's attitude toward a particular topic or product is positive, negative, or neutral [2]. The Arabic language is divided into three categories: Classical Arabic, Modern Standard Arabic, and Dialectal Arabic (Soliman et al., 2014). Sentiment analysis has been studied extensively in the English-language literature [3] [13], and many NLP tools are available for it. A significant contribution to the development of sentiment analysis for text messages was made by researchers from Cornell University (B. Pang and L. Lee) [4][5][6] in 2008. Pang and Lee published the book Opinion Mining and Sentiment Analysis [4], which is dedicated to modern methods and techniques for analyzing sentiment in text messages [5]. By comparison, relatively few works have been dedicated to sentiment analysis of Arabic texts [7]. As far as we know, most Arabic web content is written in dialects that have not been studied as much. On the Internet, the content of official websites may be in classical Arabic, but most social network content is in local dialects. There are more than 65 local dialects distributed over 22 Arab countries, and studies have shown that software tools developed for classical Arabic yield poor results on these dialects [8]. The Sudanese are part of the Arab world.
Official statistics indicate that the number of social media users in Sudan has reached 15 million. These open platforms allow young people to express their opinions in their everyday dialect, which has increased the need of stakeholders and researchers to apply opinion analysis to Arabic texts in order to study those opinions in different areas, whether political, economic, or service-related. This study aims to analyze sentiment in text written in the Sudanese vernacular using machine learning methods, in which the classifier is trained on the available training data. After training, it is possible to classify the sentiment of a comment as positive, negative, or neutral. The classification was carried out on 1050 Facebook comments about Internet service in Sudan. As there is no sentiment dictionary for the Sudanese dialect, we created a dictionary containing 1000 words categorized as positive or negative.

II. RELATED WORK
Nafissa Yussupova, Diana Bogdanova, and Maxim Boyko applied sentiment analysis to text in Russian based on machine learning approaches [9]. The researchers described the problem of sentiment classification in Russian text messages using two machine learning methods: the Naive Bayes classifier and the Support Vector Machine. One feature of the Russian language is the use of a variety of endings depending on declension, tense, and grammatical gender. Another common problem in sentiment classification across languages is that different words can have the same meaning (synonyms) and may therefore carry equal emotional value. Their task was therefore to determine how reducing words to a base form (that is, classifying with or without endings) affects the accuracy of sentiment classification, and to compare the results for Russian and English. To assess the effect of synonyms, they use the method of merging