I.J. Information Technology and Computer Science, 2019, 6, 9-17
Published Online June 2019 in MECS (http://www.mecs-press.org/)
DOI: 10.5815/ijitcs.2019.06.02
Copyright © 2019 MECS
Off-line Sindhi Handwritten Character Identification
Arsha Kumari
Department of Electronic Engineering, Mehran University of Engineering and Technology Jamshoro, Sindh Pakistan
E-mail: arsharathi56@gmail.com
Din Muhammad Sangrasi*, Sania Bhatti*, Bhawani Shankar Chowdhry** and Sapna Kumari*
*Department of Software Engineering Mehran UET Jamshoro, Sindh Pakistan
**Meritorious Professor, Faculty of Electrical, Electronics and Computer Engineering, MUET Jamshoro, Sindh
E-mail: din.muhammad@faculty.muet.edu.pk, sania.bhatti@faculty.muet.edu.pk, c.bhawani@ieee.org,
rathisapna65@gmail.com
Received: 26 March 2019; Accepted: 19 May 2019; Published: 08 June 2019
Abstract—Handwritten identification is the ability of a
computer to receive intelligible handwritten text and
translate it into machine-editable text. It is
classified into two types according to how the input is
given: off-line and online. In off-line handwritten
recognition the input is given as an image, whereas in
online recognition it is entered on a touch-screen device.
Research on off-line and online handwritten Sindhi
character identification is at a very early stage in
comparison with other languages. Sindhi is one of the
subcontinent's oldest languages with extensive literature
and rich culture. Therefore, this paper aims to identify
off-line Sindhi handwritten characters. In the proposed
work, the major steps involved in character identification
are training and testing of the system. Training is performed
using a feed-forward neural network trained with the
Back Propagation (BP) learning algorithm, accelerated by
a momentum term and an adaptive learning rate. A dataset
of 304 Sindhi handwritten characters was collected from
16 different Sindhi writers, each contributing 19
characters. The novelty of the proposed work is a
comparison of the recognition rates obtained when one,
two and three characters are presented at a time. The
results show that the recognition rate for a single
character is higher than that for multiple characters at
a time.
Index Terms—Off-line Handwritten, Neural Network
training, Back Propagation (BP) algorithm, Sindhi
Character identification.
I. INTRODUCTION
Handwritten character recognition (HCR) is advancing
communication between humans and computers and moving
the world toward automation [1]. Off-line handwritten
recognition is a comparatively easy and fast way of
entering data into a computer. A great deal of Sindhi
literature is held in Sindh literature departments only
in hard copy, where it occupies considerable space and
makes any information slow to access. It is therefore
necessary to preserve that information in digitized form
so that anyone in the world can access it easily. Sindhi
HCR is thus a first step toward preserving Sindhi
literature on the web as a worldwide resource.
Handwritten recognition is a very challenging task in
computer vision and pattern recognition, since writers
differ in writing style, character shape and font, and
images differ in quality [2]. Because Sindhi is a
cursive language in which characters are connected to
form words, recognizing off-line Sindhi handwritten
characters is more difficult still. Another problem in
recognizing Sindhi characters is the similarity between
different characters in 1) basic shape, 2) position of
dots and 3) number of dots [3].
Although much work has been done on languages such as
English [4], Chinese [5] and Arabic [6], to the best of
our knowledge very little has been done on Sindhi HCR,
so a great deal of work remains to be done in this
direction, particularly since Sindhi is a regional and
provincial language of Pakistan spoken by 60 million
people in Sindh and in other parts of the world [7].
HCR is classified into two types, off-line and online,
which differ in the way the input is given to the system.
In off-line handwritten recognition the input is a paper
document, an image or similar, and is static in nature.
In online recognition the input is given on a
touch-screen device such as a tablet, and is dynamic in
nature.
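The difference between the two input types can be made
concrete with a small sketch. The Python below is
illustrative only: the class names OfflineSample and
OnlineSample, the binary pixel encoding and the (x, y, t)
point format are assumptions introduced here, not part of
the proposed system. It shows that an online pen
trajectory can be rasterised into a static image, while
stroke order and timing are lost in the process.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class OfflineSample:
    """Static input: a scanned character as a grid of pixels."""
    # Rows of 0 (background) / 1 (ink); hypothetical encoding.
    pixels: List[List[int]]

    def ink_count(self) -> int:
        """Number of ink pixels in the image."""
        return sum(sum(row) for row in self.pixels)


@dataclass
class OnlineSample:
    """Dynamic input: pen positions in the order they were drawn."""
    # (x, y, time in seconds); hypothetical encoding.
    points: List[Tuple[int, int, float]]

    def to_offline(self, width: int, height: int) -> OfflineSample:
        """Rasterise the trajectory, discarding order and timing."""
        grid = [[0] * width for _ in range(height)]
        for x, y, _t in self.points:
            # Ignore pen positions that fall outside the canvas.
            if 0 <= x < width and 0 <= y < height:
                grid[y][x] = 1
        return OfflineSample(grid)


# A three-point diagonal stroke recorded on a touch screen.
stroke = OnlineSample([(0, 0, 0.00), (1, 1, 0.02), (2, 2, 0.04)])
image = stroke.to_offline(width=3, height=3)
```

Here image.pixels is a 3 x 3 grid with ink on the
diagonal. The reverse conversion is not possible in
general, because a static image carries no record of
stroke order or timing.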
The main purpose of the proposed work is to identify
off-line handwritten Sindhi characters using the BP
algorithm with an adaptive learning rate and momentum,
which reduce the training time of the network. An
additional contribution of this research is a
comparative analysis of the recognition rate for a single
character and for multiple characters at a time. The
system is based on a graphical user interface (GUI)
developed in the MATLAB 2017a programming environment by
utilizing its