ISSN: 0976-9102 (ONLINE) ICTACT JOURNAL ON IMAGE AND VIDEO PROCESSING, MAY 2020, VOLUME: 10, ISSUE: 04
DOI: 10.21917/ijivp.2020.0315

ARABIC HANDWRITTEN CHARACTERS RECOGNITION VIA MULTI-SCALE HOG FEATURES AND MULTI-LAYER DEEP RULE-BASED CLASSIFICATION

Soumia Djaghbellou 1, Zahid Akhtar 2, Abderraouf Bouziane 3 and Abdelouahab Attia 4
1,4 Department of Computer Science, Mohamed El Bachir El Ibrahimi University of Bordj Bou Arreridj, Algeria
2 Department of Computer Science, University of Memphis, United States of America
3 LMSE Laboratory, Mohamed El Bachir El Ibrahimi University Bordj Bou Arreridj, Algeria

Abstract
Optical character recognition systems for handwritten Arabic still face challenges, owing to the high level of ambiguity and complexity and the tremendous variation in human writing styles. In this paper, we propose a new and effective Arabic handwritten character recognition framework using multi-scale histogram of oriented gradients (HOG) features and the deep rule-based (DRB) classifier. In the feature extraction stage, the proposed framework combines multi-scale HOG features; the DRB classifier is then applied to the resulting comprehensive HOG features to obtain the final class label. Experimental analyses were conducted on the publicly available cursive Arabic Handwritten Characters Database (AHCD), which contains 16800 characters. The experimental results demonstrate the efficacy of the proposed recognition system compared to existing state-of-the-art systems.

Keywords:
Arabic Character Recognition, Writing, DRB Classifier, HOG, AHCD

1. INTRODUCTION
Optical character recognition (OCR) systems have made considerable progress, reaching impressive accuracy, and have shown promising prospects in the field of handwritten Arabic character recognition [1]. Arabic has several distinctive traits that make it a unique language.
Not only the dialects but also the handwriting of Arabic speakers and writers varies with context and location. The Arabic script includes 28 letters. As illustrated in Table.1, each letter can assume two to four shapes depending on its position in a word: beginning, middle, end, or isolated. Hence, the position-based variability and the different writing styles of Arabic letters pose great challenges to automated character identification.

Researchers have investigated various approaches to Arabic OCR, employing different techniques for preprocessing, feature extraction, and classification [2], e.g., recognition using segmentation [3], raw pixel data [4], and a simple deep sparse autoencoder [4]. In particular, Shatnawi and Abdallah [6] proposed a model to recognize characters with real-world distortions in Arabic handwriting, using a dataset containing 48 examples of each Arabic handwritten character (i.e., 28 letters and 10 digits) obtained from 48 different writers. The model in [6] achieved a recognition rate of 73.4%. Elzobi et al. [7] employed the Gabor wavelet transform to extract the mean and standard deviation of the image. Classification was carried out with a Support Vector Machine (SVM) classifier. The IESK-arDB [8] and IFN/ENIT [9] datasets were used to evaluate the approach, and the authors reported an average recognition rate of 71%. However, the scheme in [7] suffers from high memory and run-time requirements. Sahlol and Suen [10] investigated several pre-processing schemes with various features for Arabic handwritten character recognition. The system presented in [10] was trained and tested with an artificial neural network (ANN) on the ENPRMI dataset. The reported results indicate that the system correctly identified 88% of the test set. Maqqor et al.
[11] designed a system for handwritten Arabic text that uses a sliding-window technique for feature extraction with multiple classifiers. Their model was evaluated on text images from the IFN/ENIT database and achieved a recognition rate of 76.54%.

From all of the preceding observations, it is inferred that automatic recognition of cursive Arabic handwritten characters has received comparatively little attention; it is therefore taken up as the main subject of this study. This paper presents a framework for cursive Arabic handwritten character recognition that uses a multi-scale Histogram of Oriented Gradients (HOG) descriptor for feature extraction and a deep rule-based (DRB) classifier for classification. Empirical results on the publicly available cursive Arabic Handwritten Characters Database (AHCD), containing 16800 characters, are promising and compare favorably with existing methods for Arabic handwritten character recognition.

The remainder of this article is organized as follows: Section 2 provides a brief description of the main characteristics of the Arabic script. The architecture of the proposed system, with details of the processing steps such as feature extraction and classification, is presented in Section 3. The experimental dataset, results, and analysis are discussed in Section 4. Conclusions are outlined in Section 5.

Table.1. Arabic alphabet shapes

                              Connected
No.   Name    Isolated   Beginning   Middle   End
1     Alif    ا          ا           ـا       ـا
2     Baa     ب          بـ          ـبـ      ـب
3     Taa     ت          تـ          ـتـ      ـت
4     Thaa    ث          ثـ          ـثـ      ـث
5     Jeem    ج          جـ          ـجـ      ـج
6     Haa     ح          حـ          ـحـ      ـح
7     Khaa    خ          خـ          ـخـ      ـخ
8     Daal    د          د           ـد       ـد
9     Thal    ذ          ذ           ـذ       ـذ
10    Raa     ر          ر           ـر       ـر
11    Zaa     ز          ز           ـز       ـز
12    Seen    س          سـ          ـسـ      ـس
13    Sheen   ش          شـ          ـشـ      ـش
14    Saad    ص          صـ          ـصـ      ـص
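
The position-dependent shapes listed in Table.1 can also be inspected programmatically. The following is a minimal sketch, not part of the proposed framework: it assumes that the Unicode Arabic Presentation Forms-B block (U+FE70 to U+FEFF) is an acceptable stand-in for the isolated, beginning, middle, and end shapes, and the helper name `positional_forms` is hypothetical. It relies only on Python's standard `unicodedata` module.

```python
import unicodedata

# Position tags used by the Unicode Character Database for Arabic
# presentation forms; "initial"/"medial"/"final" correspond to the
# Beginning/Middle/End columns of Table.1.
POSITIONS = ("isolated", "initial", "medial", "final")


def positional_forms(letter):
    """Return {position: glyph} for one base Arabic letter (hypothetical helper).

    Scans the Arabic Presentation Forms-B block and keeps the code points
    whose compatibility decomposition is exactly the given base letter.
    """
    target = f"{ord(letter):04X}"
    forms = {}
    for cp in range(0xFE70, 0xFF00):
        # Decompositions look like "<initial> 0628"; unassigned code points
        # give an empty string, and ligatures have more than one base letter.
        parts = unicodedata.decomposition(chr(cp)).split()
        if len(parts) == 2 and parts[1] == target:
            tag = parts[0].strip("<>")
            if tag in POSITIONS:
                forms[tag] = chr(cp)
    return forms


if __name__ == "__main__":
    # Baa, Seen, Alif, Daal: a mix of letters from Table.1.
    for base in "\u0628\u0633\u0627\u062F":
        forms = positional_forms(base)
        print(unicodedata.name(base), len(forms), sorted(forms))
```

Under this assumption, a letter such as Baa yields four distinct forms, whereas Alif and Daal yield only an isolated and a final form, which is consistent with the letters in Table.1 whose beginning shape is identical to the isolated one.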