Jyoti Mahajan and Rohini Mahajan/ Elixir Comp. Sci. & Engg. 69 (2014) 23716-23719 23716
Introduction
A character can be written in many ways that differ in shape and
in properties such as tilt, stroke, and cursiveness. Commonly used
word-processing software offers many fonts, each with its own
upright and italic forms, yet humans easily recognize each
character in text written in any of these ways, because human
perception extracts from the retinal image the features that
define a character’s overall shape. Modeling this perceptual
process in a machine, however, is a hard problem. Optical Character
Recognition, usually abbreviated to OCR, is the mechanical or
electronic translation of images of handwritten, typewritten or
printed text into machine-editable text. The images are usually
captured by a scanner. Throughout this paper, however, OCR
refers to the recognition of printed text. Data entry through OCR
is faster, more accurate, and generally more efficient than
keystroke data entry. There are three basic types of OCR input.
Offline handwritten text is produced by a person writing with a
pencil on paper and is then scanned into digital format. Online
handwritten text is written directly on a digitizing tablet with a
stylus; the output is a sequence of x-y coordinates that express
pen position, together with other information such as pressure
(exerted by the writer) and writing speed. Machine-printed text
is common in daily use and is produced by printing processes
such as offset, laser, and inkjet. Optical Character Recognition is used to
convert different types of documents, such as scanned paper
documents, PDF files or images captured by a digital camera
into editable and searchable data. OCR technology can also
be used for processing checks, documenting library materials,
storing documents, searching text, and extracting data from
paper-based documents.
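The online handwritten input described above can be pictured as a list of pen samples. The following is a minimal sketch, not a format taken from any particular tablet: the `PenSample` fields, the pressure scale, and the timestamps are assumptions made for illustration. It shows how writing speed can be derived from consecutive x-y positions when each sample carries a timestamp.

```python
from dataclasses import dataclass
from math import hypot
from typing import List


@dataclass
class PenSample:
    """One reading from a digitizing tablet (field names are illustrative)."""
    x: float         # horizontal pen position
    y: float         # vertical pen position
    pressure: float  # pressure exerted by the writer, assumed in 0.0..1.0
    t: float         # timestamp in seconds


def writing_speeds(stroke: List[PenSample]) -> List[float]:
    """Return the pen speed between each pair of consecutive samples."""
    speeds = []
    for prev, cur in zip(stroke, stroke[1:]):
        dt = cur.t - prev.t
        dist = hypot(cur.x - prev.x, cur.y - prev.y)
        # Guard against repeated timestamps, which would divide by zero.
        speeds.append(dist / dt if dt > 0 else 0.0)
    return speeds


# A hypothetical four-sample stroke; the third sample repeats a timestamp.
stroke = [
    PenSample(0.0, 0.0, 0.4, 0.00),
    PenSample(3.0, 4.0, 0.6, 0.50),
    PenSample(3.0, 4.0, 0.5, 0.50),
    PenSample(6.0, 8.0, 0.7, 1.00),
]
speeds = writing_speeds(stroke)
print(speeds)  # [10.0, 0.0, 10.0]
```

Offline handwritten and machine-printed text carry no such timing or pressure information, since only the scanned image is available.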
Review Of Literature
An optical character recognition system based on Artificial
Neural Networks (ANNs) is trained using the back-propagation
algorithm. Each typed English letter is represented by binary
numbers that serve as input to a simple feature extraction system,
and the output of that system, together with the original input, is
fed to an ANN. The feed-forward algorithm, which describes how the
network computes its output, is followed by the back-propagation
algorithm, which comprises training, calculating the error, and
modifying the weights. Artificial neural networks are commonly
used to perform character recognition because of their high noise
tolerance, and such systems can yield excellent results.
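The feed-forward and back-propagation steps described above can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the system proposed in this paper: the 3x3 binary bitmaps, the single hidden layer of six sigmoid units, the learning rate, and the epoch count are all hypothetical choices, and the raw bitmap is fed to the network directly with no separate feature extraction stage.

```python
import math
import random

# Hypothetical 3x3 binary bitmaps for three letters (illustrative data only).
PATTERNS = {
    "I": [0, 1, 0,
          0, 1, 0,
          0, 1, 0],
    "L": [1, 0, 0,
          1, 0, 0,
          1, 1, 1],
    "T": [1, 1, 1,
          0, 1, 0,
          0, 1, 0],
}
LABELS = list(PATTERNS)
N_IN, N_HID, N_OUT = 9, 6, len(LABELS)  # layer sizes are assumptions
LEARNING_RATE = 0.5                     # assumed value

random.seed(0)
# Each row holds the weights into one unit; the last entry is the bias.
w_hid = [[random.uniform(-0.5, 0.5) for _ in range(N_IN + 1)] for _ in range(N_HID)]
w_out = [[random.uniform(-0.5, 0.5) for _ in range(N_HID + 1)] for _ in range(N_OUT)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(weights, inputs):
    """Activations of one fully connected sigmoid layer."""
    return [sigmoid(sum(w * v for w, v in zip(row, inputs + [1.0]))) for row in weights]


def forward(bits):
    """Feed-forward pass: returns hidden and output activations."""
    hidden = layer(w_hid, list(bits))
    return hidden, layer(w_out, hidden)


def train_step(bits, target):
    """One feed-forward pass followed by one back-propagation update."""
    hidden, output = forward(bits)
    # Calculate the error signal at each output unit (squared-error loss).
    d_out = [(o - t) * o * (1.0 - o) for o, t in zip(output, target)]
    # Propagate the error back to the hidden layer.
    d_hid = [h * (1.0 - h) * sum(d_out[k] * w_out[k][j] for k in range(N_OUT))
             for j, h in enumerate(hidden)]
    # Modify the weights by gradient descent.
    for k in range(N_OUT):
        for j, v in enumerate(hidden + [1.0]):
            w_out[k][j] -= LEARNING_RATE * d_out[k] * v
    for j in range(N_HID):
        for i, v in enumerate(list(bits) + [1.0]):
            w_hid[j][i] -= LEARNING_RATE * d_hid[j] * v
    return sum((o - t) ** 2 for o, t in zip(output, target))


errors = []  # total squared error per epoch
for epoch in range(2000):
    total = 0.0
    for index, label in enumerate(LABELS):
        target = [1.0 if k == index else 0.0 for k in range(N_OUT)]
        total += train_step(PATTERNS[label], target)
    errors.append(total)


def classify(bits):
    """Return the label whose output unit is most active."""
    _, output = forward(bits)
    return LABELS[output.index(max(output))]


print([classify(PATTERNS[label]) for label in LABELS])
```

Flipping a pixel in one of the bitmaps and calling `classify` again is a simple way to probe the noise tolerance mentioned above, although a network this small gives no guarantee about the result.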
The feature extraction step of optical character recognition is the
most important. A poorly chosen set of features will yield poor
classification rates with any neural network. The most
straightforward way of describing a character is by the actual raster
image. Another approach is to extract certain features that still
A Proposed method for designing an intelligent system for optical handwritten
character recognition
Jyoti Mahajan 1,* and Rohini Mahajan 2
1 Government College of Engineering & Technology, Jammu.
2 School of Engineering, Shri Mata Vaishno Devi University, Katra.
ABSTRACT
The accurate recognition of Latin-script, typewritten text is now considered largely a
solved problem. Typical accuracy rates exceed 99%, although certain applications
demanding even higher accuracy require human review for errors. Other areas, including
recognition of hand printing, cursive handwriting, and printed text in other scripts
(especially those with a very large number of characters), are still the subject of active
research. Recognition rates for cursive text are
even lower than those for hand-printed text. Higher rates of recognition of general cursive
script will likely not be possible without the use of contextual or grammatical information.
For example, recognizing entire words from a dictionary is easier than trying to parse
individual characters from script. Reading the Amount line of a cheque (which is always a
written-out number) is an example where using a smaller dictionary can increase
recognition rates greatly. Knowledge of the grammar of the language being scanned can
also help determine if a word is likely to be a verb or a noun, for example, allowing
greater accuracy. The shapes of individual cursive characters themselves simply do not
contain enough information to recognize all handwritten cursive script with greater than
98% accuracy. OCR is a basic technology that is also
used in advanced scanning applications. An advanced scanning solution can therefore be
unique, patented, and not easily copied despite being built on this basic OCR
technology. This paper proposes an intelligent system for optical character
recognition that applies a feature extraction algorithm before an Artificial Neural
Network (ANN) classifies the characters, an approach that
promises increased efficiency in character recognition.
© 2014 Elixir All rights reserved.
Elixir Comp. Sci. & Engg. 69 (2014) 23716-23719
Computer Science and Engineering
Available online at www.elixirpublishers.com (Elixir International Journal)
ARTICLE INFO
Article history:
Received: 7 November 2013;
Received in revised form: 20 April 2014;
Accepted: 28 April 2014;
Keywords
Latin-script,
Typewritten,
Recognition.
E-mail addresses: rohinimahajan11@yahoo.in