RESEARCH ARTICLE
A Novel Hierarchical Technique for Offline Handwritten
Gurmukhi Character Recognition
Munish Kumar · M. K. Jindal · R. K. Sharma
Received: 2 August 2013 / Revised: 5 February 2014 / Accepted: 14 February 2014 / Published online: 18 November 2014
© The National Academy of Sciences, India 2014
Abstract The growing need for handwritten character
recognition systems in Indian offices such as banks and post
offices has made this an important field of research. In this
paper, the authors present a novel hierarchical technique for
isolated offline handwritten Gurmukhi character recognition.
A robust feature set of 105 feature elements is proposed for
the recognition of offline handwritten Gurmukhi characters,
built from four types of topological features, namely,
horizontally peak extent features, vertically peak extent
features, diagonal features, and centroid features. For
classification, a Support Vector Machine (SVM) classifier has
been used. The SVM classifier has been considered with four
different kernels, namely, the linear kernel, polynomial
kernel, RBF kernel, and sigmoid kernel. For training and
testing the classifier, we have used 3,500 samples of isolated
offline handwritten Gurmukhi characters written by one hundred
different writers. A maximum recognition accuracy of 91.80 %
has been achieved with the proposed technique when using the
PCA feature set and an SVM classifier with a linear kernel.
Keywords Character recognition · Feature extraction ·
Classification · Feature selection
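As an illustration of the four SVM kernels named in the abstract, the following minimal sketch writes each one as a plain function of two feature vectors. The formulas are the commonly used textbook forms. The parameter names gamma, coef0 and degree, their default values, and the helper gram_matrix are assumptions made for this sketch, and are not the settings or code used by the authors.

```python
import math


def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    """K(x, y) = x . y"""
    return dot(x, y)


def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    """K(x, y) = (gamma * x . y + coef0) ** degree (assumed defaults)."""
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2) (assumed default gamma)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    """K(x, y) = tanh(gamma * x . y + coef0) (assumed defaults)."""
    return math.tanh(gamma * dot(x, y) + coef0)


# Lookup table so that a kernel can be selected by name.
KERNELS = {
    "linear": linear_kernel,
    "polynomial": polynomial_kernel,
    "rbf": rbf_kernel,
    "sigmoid": sigmoid_kernel,
}


def gram_matrix(samples, kernel):
    """Hypothetical helper: pairwise kernel values for a list of vectors."""
    return [[kernel(a, b) for b in samples] for a in samples]
```

An SVM works only with these pairwise kernel values, so changing the kernel changes the similarity measure between two character feature vectors while the rest of the classifier stays the same. For example, gram_matrix(samples, KERNELS["rbf"]) would give the RBF similarity matrix for a list of feature vectors.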
Introduction
Computers now influence much of daily life, and almost all
important processing is done electronically. Given today's
demands, the transfer of data between humans and computers
should be simple and fast. Optical Character Recognition (OCR)
is an important area of research, especially for handwritten
text recognition. The success of commercially available OCR
systems has yet to extend to handwritten text recognition. It
is well established that the wide variation in individual
writing styles makes the recognition of handwritten characters
difficult. In recent times, offline handwritten Gurmukhi
character recognition has been explored by researchers owing
to its practical usage. An offline handwritten character
recognition system consists of four phases, namely,
digitization, preprocessing, feature extraction, and
classification. Digitization is the process of converting a
paper-based handwritten document into electronic form using a
scanner. Preprocessing is used to extract meaningful
information from the bitmap image. In this phase, the bitmap
image is transformed into a thinned image using a parallel
thinning algorithm [25]. In the feature extraction phase,
features of the thinned bitmap image of a character are
extracted. The performance of a handwritten character
recognition system depends primarily on the features that are
extracted. The authors have extracted horizontally peak extent
features, vertically peak extent features, diagonal features,
and centroid features in order to build the feature set for a
given character. Classification
M. Kumar (✉)
Department of Computer Science, Panjab University Rural
Centre Kauni, Muktsar, Punjab, India
e-mail: munishcse@gmail.com
M. K. Jindal
Department of Computer Science and Applications, Panjab
University Regional Centre, Muktsar, Punjab, India
e-mail: manishphd@rediffmail.com
R. K. Sharma
School of Mathematics and Computer Applications, Thapar
University, Patiala, Punjab, India
e-mail: rksharma@thapar.edu
Natl. Acad. Sci. Lett. (November–December 2014) 37(6):567–572
DOI 10.1007/s40009-014-0280-1