International Journal of Information Retrieval Research, 1(3), 54-70, July-September 2011

Copyright © 2011, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.

The Effect of Stemming on Arabic Text Classification: An Empirical Study

Abdullah Wahbeh, Yarmouk University, Jordan
Mohammed Al-Kabi, Yarmouk University, Jordan
Qasem Al-Radaideh, Yarmouk University, Jordan
Emad Al-Shawakfa, Yarmouk University, Jordan
Izzat Alsmadi, Yarmouk University, Jordan

ABSTRACT

The information world is rich in documents held in different formats and applications, such as databases, digital libraries, and the Web. Text classification supports the search functionality offered by search engines and information retrieval systems, helping them deal with the large number of documents on the Web. Many research papers in the field of text classification have been applied to English, Dutch, Chinese, and other languages, whereas fewer have been applied to the Arabic language. This paper addresses the automatic classification of Arabic text documents. It applies text classification to Arabic text documents using stemming as part of the preprocessing steps. Results showed that, when text classification is applied without stemming, the support vector machine (SVM) classifier achieved the highest classification accuracy under the two test modes, with 87.79% and 88.54%. Stemming, on the other hand, negatively affected accuracy: the SVM accuracy under the two test modes dropped to 84.49% and 86.35%.

Keywords: Arabic Text Classification, Decision Tree, Naïve Bayes Classifier (NB), Natural Language Processing, Stemming, Support Vector Machine (SVM), Text Classification

DOI: 10.4018/ijirr.2011070104

1. INTRODUCTION

The tremendous growth of Arabic text documents available on the Web and in databases has posed a major challenge to researchers: finding better ways to handle this volume of information so that search engines and information retrieval systems can provide relevant information accurately, a task that has become crucial to satisfying the needs of different end users. Text classification and its techniques have become a major tool for dealing with the large amount of data available on the Web and in databases. Text classification is the task of automatically assigning text documents to one or more predefined categories based on content and linguistic features (Gharib et al., 2009;