Classification of Arabic Writer Based on Clustering Techniques

Ahmed Abdullah1(✉), Mohammed Sabbih Al-Tamimi2, Omar Ismael Al-Sanjary3, and Ghazali Sulong4

1 Department of Computer Science, Kurdistan Technical Institute, Sulaimani Heights, Sulaymaniyah, Kurdistan Region, Iraq
ahmed.abdullah@kti.edu.krd
2 Department of Computer Science, College of Science, University of Baghdad, Baghdad, Iraq
m_af tamimi ? S@yahoo.com
3 Center of Scientific Research and Development, Nawroz University, Kurdistan […]
4 School of Informatics and Applied Mathematics, Universiti Malaysia Terengganu, 21030 Kuala Nerus, Terengganu, Malaysia
ghazali@utmspace.edu.my

Abstract. Arabic text categorization for pattern recognition is challenging. We propose for the first time a novel holistic method based on clustering for classifying Arabic writers. The categorization is accomplished stage-wise. Firstly, the document images are segmented into lines, words, and characters. Secondly, structural and statistical features are extracted from the segmented portions. Thirdly, the F-Measure is used to evaluate the performance of the extracted features and their combinations under different linkage methods, for each distance measure and for different numbers of groups. Finally, experiments are conducted on the standard KHATT dataset of Arabic handwritten text, which comprises varying samples from 1000 writers. The results in the generation step are obtained from multiple runs of the individual clustering methods for each distance measure. The best results are achieved when the intensity feature, the line-slope feature, and their combined feature set are applied. It is demonstrated that different numbers of clusters, paired with a good set of features, can deliver significant improvements in clustering handwritten structures.

Keywords: Clustering · Writer identification · Feature extraction · Feature combination ·
Distance measures

1 Introduction

Undoubtedly, pattern recognition is one of the significant areas in various engineering and scientific fields, including computer vision, biology, and artificial intelligence. Specifically, Writer Identification (WI) through handwriting analysis is an attractive topic within pattern recognition. Lately, WI for handwriting samples has been widely studied. In this regard, a variety of writer recognition techniques have been

© Springer International Publishing AG 2018. F. Saeed et al. (eds.), Recent Trends in Information and Communication Technology, Lecture Notes on Data Engineering and Communications Technologies 5. DOI 10.1007/978-3-319-59427-9_6