International Journal of Innovative Technology and Exploring Engineering (IJITEE)
ISSN: 2278-3075, Volume-8 Issue-9, July 2019
1677
Published By:
Blue Eyes Intelligence Engineering
& Sciences Publication
Retrieval Number: I8099078919/19©BEIESP
DOI:10.35940/ijitee.I8099.078919
Augmentation of Local, Global Feature Analysis for
online Character Recognition System for Telugu
Language using Feed Forward Neural Networks
(FFNN)
Goda Srinivasarao, Rajeswara Rao Ramisetty
Abstract: In this paper, we propose ANN-based online
handwritten character recognition for the Telugu language. The
literature review shows that the size of the database and the
preprocessing approach play a prominent role in recognition
performance. Preprocessing techniques such as normalization,
interpolation, uniformization, smoothing, slant correction and
resampling are applied for better recognition performance.
Local features such as the $(x, y)$ coordinates and their first-
and second-order differences $(\Delta x, \Delta y)$ and
$(\Delta^2 x, \Delta^2 y)$, and global features such as
$\tan(\theta)$, are used for ANN modeling and classification of
52 Telugu vowels and consonants. Recognition performance is
evaluated by augmenting the local and global features, including
the $\tan(\theta)$ features. The performance is evaluated in
terms of precision, recall and F-measure. A significant
improvement is obtained through this augmentation and through
the preprocessing techniques. The database used for the study is
the HP-online Telugu database.
Index Terms — ANN, HP-database, local features, global
features
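As a minimal sketch of the evaluation measures named in the abstract, the following Python function computes per-class precision, recall and F-measure from true and predicted labels. The function name `per_class_scores` and the toy labels are illustrative assumptions; the paper does not state how the scores are averaged over the 52 classes, so no averaging is shown.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f_measure)} for every label seen.

    Hypothetical helper for illustration; not taken from the paper.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # pred was output, but the true class differs
            fn[truth] += 1  # the true class was missed

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted_total = tp[label] + fp[label]
        actual_total = tp[label] + fn[label]
        precision = tp[label] / predicted_total if predicted_total else 0.0
        recall = tp[label] / actual_total if actual_total else 0.0
        denom = precision + recall
        # F-measure is the harmonic mean of precision and recall.
        f_measure = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f_measure)
    return scores


if __name__ == "__main__":
    # Toy labels (assumed): two samples each of the classes "a" and "ka".
    truth = ["a", "a", "ka", "ka"]
    predicted = ["a", "ka", "ka", "ka"]
    for label, (p, r, f) in per_class_scores(truth, predicted).items():
        print(f"{label}: precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Classes that are never predicted get a precision of 0.0 here instead of raising a division error; that convention is a design choice of this sketch.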
I. INTRODUCTION
Although text in languages like English can readily be given as
input to computers, to be executed as commands or processed as
data, the same is not true for quite a few languages such as
Telugu, Chinese, Hindi and other Indian or Japanese languages,
because these languages involve a lot of stroke variation from
writer to writer. For such languages, rather than giving input
via keyboard or voice, it is advisable to give it via
handwritten samples (for example, on paper or with electronic
pens).
For instance, entering data into a database from hand-filled
railway reservation forms is a tedious task that can be
automated. Moreover, properly trained systems will be capable of
recognizing handwritten text better than a human. Handwriting
recognition therefore plays a crucial role in human-computer
interaction. Efforts have already been made to build systems in
both the online and offline settings for various aims, such as
recognizing numeric characters and recognizing scripts such as
Assamese [2], Thai [5], and Arabic [4].
Revised Manuscript Received on July 05, 2019.
Goda Srinivasarao, Research Scholar, Department of Computer Science
& Engineering, JNTUA-Anantapuramu, A.P, India
Rajeswara Rao Ramisetty, Professor, Department of Computer Science
& Engineering, JNTUK-University College of Engineering-Vizianagaram
Unlike English, Telugu script has a basic character set of 16
vowels and 36 consonants. The characters in Telugu script are
combinations of these basic characters and their modifiers,
which gives rise to about 18,000 unique characters. All of these
unique characters can be represented as combinations of a
manageable set of 235 strokes. The strokes of a character other
than the first, which is taken as the main stroke, can be
divided by position into three groups: top strokes, bottom
strokes and baseline strokes. As a preliminary attempt, we use
character-based recognition for online handwriting recognition
of Telugu, a very popular South Indian language on which not
much research has yet been done.
Telugu is spoken in the South Indian states of Andhra Pradesh
and Telangana as well as in several neighboring states. A subset
of Telugu symbols is given in Figure 2. In Telugu script, many
of the characters resemble one another in structure. The
framework for online handwritten character recognition is
depicted in Figure 2. Further, many users write two or more
characters in a similar way, which makes them difficult to
classify correctly; Telugu has several such confusing pairs.
An SVM-based stroke recognition method is used in [1] for Telugu
characters. Based on proximity analysis, the recognized strokes
are mapped onto characters using information about the stroke
combinations of the script. Each stroke is represented as
preprocessed (x, y) coordinates. A data set of 37,817 samples
was collected from 92 users using the Superpen™, a product of UC
Logic. The reported recognition accuracy is 83%. The importance
of annotating online handwritten data is also illustrated in
[1]. A modular approach to stroke recognition is proposed in
[2]. Based on their relative position in the character, the
strokes are categorized into baseline, bottom and top strokes,
and a separate SVM recognition model is used for each category.
The recognition accuracy of each stage is higher than that of a
combined classifier. An elastic matching technique, dynamic time
warping (DTW), is used in [3]. Four sets of local features are
used: x-y features; Tangent Angle (TA) and Shape Context (SC)
features; the Generalized Shape Context (GSC) feature; and a
fourth set containing (x, y) coordinates, normalized first and
second derivatives, and curvature features.
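To make the elastic matching idea concrete, the following Python sketch builds a simple per-point local feature vector from (x, y) pen samples and compares two strokes with dynamic time warping. It is an illustration under stated assumptions, not the method of [3]: the central-difference scheme, the unit-length normalization of the first derivative, the Euclidean local cost and the function names `local_features` and `dtw_distance` are all assumed here, and the curvature and shape-context features are omitted.

```python
import math


def local_features(points):
    """Per-point vector (x, y, dx, dy, ddx, ddy) for a list of (x, y) samples.

    Assumed scheme: central differences (one-sided at the ends), with the
    first derivative normalized to unit length.
    """
    n = len(points)

    def diff(seq, i):
        # Difference between the neighbours of index i, clamped at the ends.
        a = seq[max(i - 1, 0)]
        b = seq[min(i + 1, n - 1)]
        return b[0] - a[0], b[1] - a[1]

    first = []
    for i in range(n):
        dx, dy = diff(points, i)
        norm = math.hypot(dx, dy) or 1.0  # avoid dividing by zero
        first.append((dx / norm, dy / norm))

    feats = []
    for i in range(n):
        ddx, ddy = diff(first, i)
        feats.append((points[i][0], points[i][1], *first[i], ddx, ddy))
    return feats


def dtw_distance(a, b):
    """Classic dynamic time warping distance with Euclidean local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row for "no elements of a consumed"
    for fa in a:
        curr = [inf] * (len(b) + 1)
        for j, fb in enumerate(b, start=1):
            cost = math.dist(fa, fb)
            # Extend the cheapest of: insertion, deletion, match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


if __name__ == "__main__":
    # Two toy strokes (assumed data): a straight line and a bent line.
    straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
    bent = [(0, 0), (1, 0), (2, 1), (3, 2)]
    print(dtw_distance(local_features(straight), local_features(straight)))
    print(dtw_distance(local_features(straight), local_features(bent)))
```

Because DTW allows one point to align with several points of the other sequence, two strokes traced along the same path at different sampling rates can still obtain a small distance, which is the reason elastic matching suits handwriting with writer-dependent speed.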