Journal of Intelligent Learning Systems and Applications, 2012, 4, 207-215
http://dx.doi.org/10.4236/jilsa.2012.43021 Published Online August 2012 (http://www.SciRP.org/journal/jilsa)
A Multi-Agent Approach to Arabic Handwritten Text Segmentation
Ashraf Elnagar*, Rahima Bentrcia
Computer Science Department, College of Sciences, University of Sharjah, Sharjah, UAE.
Email: *ashraf@sharjah.ac.ae
Received November 24th, 2011; revised April 19th, 2012; accepted April 26th, 2012
ABSTRACT
The segmentation of individual words into characters is a vital process in handwritten character recognition systems. In
this paper, a novel approach is proposed to segment handwritten Arabic text (words). We consider the “Naskh” font
style. The segmentation algorithm employs seven agents to detect regions where segmentation is illegal. Feature
points (end points) are extracted from the remaining regions of the word-image. Initially, the midpoint between every two
successive end points is considered as a candidate segmentation point, subject to a set of rules. The experimental results
are promising: the approach achieved a success rate of 86%.
Keywords: Character Segmentation; Handwritten Recognition Systems; Multi-Agents; Arabic Handwriting
1. Introduction
For the past three decades, there has been increasing
interest among researchers in problems related to
handwritten text segmentation and recognition, regardless
of the language used [1]. Most handwriting recognition
systems are based on segmentation, which is the
operation that seeks to decompose a word image into a
sequence of sub-images containing isolated characters.
Despite the extensive work done on the off-line
recognition of handwritten Latin and Asian language text,
only a small number of research papers and reports have
been published on the recognition of Arabic handwriting [2].
This is probably a result of a lack of adequate support in
terms of funding and other resources, such as comprehensive
and standard Arabic text databases and dictionaries, and
certainly of the difficulties associated with Arabic
handwritten text segmentation. One such difficulty is the
cursive nature of Arabic handwriting, where most of the
characters in a single word are connected to each other.
Another is the existence of overlapping characters, which
are not attached to each other but share horizontal space.
Due to the difficulties mentioned above, many researchers
bypass the segmentation stage when developing a
recognition system. However, this is neither practical nor
sufficient in applications that require recognition of a
large vocabulary, where several words may have the same
global shape. Such applications include bank check
processing, postal address and zip code recognition [3,4],
automated handwritten document entry and understanding,
mail sorting, and other business and scientific applications.
In addition, segmentation plays an effective role in reducing
the complexity of recognition systems, since the number of
recognition classes will be the number of Arabic letters
and not the possible combinations of them.
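As a rough illustration of this argument (not taken from the paper), the short Python sketch below compares the number of classes a letter-level recognizer needs with an upper bound on the number of letter combinations a recognizer without segmentation could face. It assumes the basic 28-letter Arabic alphabet and ignores positional letter forms and lexicon constraints; the function names are hypothetical.

```python
# Illustrative sketch only: class counts with and without segmentation.
# Assumption: 28 basic Arabic letters; positional forms are ignored.
ARABIC_LETTER_COUNT = 28


def letter_classes(alphabet_size: int = ARABIC_LETTER_COUNT) -> int:
    """Classes needed when words are first segmented into characters."""
    return alphabet_size


def combination_classes(max_len: int,
                        alphabet_size: int = ARABIC_LETTER_COUNT) -> int:
    """Upper bound on distinct letter strings of length 1..max_len.

    This counts every possible string, so it overestimates the size of
    any real vocabulary; it only shows how quickly combinations grow.
    """
    return sum(alphabet_size ** n for n in range(1, max_len + 1))


if __name__ == "__main__":
    for max_len in (1, 2, 3, 4):
        print(max_len, letter_classes(), combination_classes(max_len))
```

The letter-level count stays fixed at 28 while the bound on combinations grows exponentially with word length, which is the sense in which segmentation keeps the classifier small.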
In this paper, we address the problem of segmenting
Arabic handwritten words into characters. The proposed
approach utilizes seven agents which cooperate to identify
the regions where the insertion of segmentation
points is illegal. The segmentation algorithm is described
in Figure 1 as a block diagram. First, the image of the
Figure 1. The basic steps in the algorithm.
*Corresponding author.
Copyright © 2012 SciRes. JILSA