Research Article, July 2017
© www.ijarcsse.com , All Rights Reserved Page | 206
International Journals of Advanced Research in
Computer Science and Software Engineering
ISSN: 2277-128X (Volume-7, Issue-7)
Review of Optical Character Recognition Techniques & Applications
Arusa Firdous
M. Tech Student
Department of Computer Sciences,
Swami Devi Dyal Inst. of Engg. & Technology
Kurukshetra University, Kurukshetra, Haryana, India
Neha Pawar
Assistant Professor
Department of Computer Sciences,
Swami Devi Dyal Inst. of Engg. & Technology
Kurukshetra University, Kurukshetra, Haryana, India
Muheet Ahmed Butt
Scientist, PG Department of Computer Sciences,
University of Kashmir, Srinagar, Jammu and Kashmir,
India
Majid Zaman
Scientist, Directorate of IT&SS.
University of Kashmir, Srinagar, Jammu and Kashmir,
India
DOI: 10.23956/ijarcsse/V7I7/0158
Abstract: The recognition of both keyboard-typed and handwritten characters still has a long way to go in
terms of research. Although significant success has been achieved for typewritten characters, recognition of
handwritten characters has yet to reach an appreciable level. Most of the methods that have been proposed in this
regard have high computational complexity. This review provides an in-depth survey of OCR methods, covering
segmentation, classification and recognition of characters independent of size and texture. It also summarizes the
literature and provides a comparative analysis of various OCR techniques.
Keywords: OCR, PSO, BFO, THD, EDM
I. INTRODUCTION
Optical Character Recognition (OCR) focuses on the extraction and interpretation of meaningful information pertaining
to a character from a digitized image: scanned images of handwritten or typewritten text are converted into the
corresponding machine-encoded text. A typical OCR system is based on three stages: preprocessing, feature
extraction and discrimination. Each stage has its own problems and challenges, and each affects system efficiency,
measured here in terms of processing time and recognition errors. In order to avoid these difficulties, this review
presents a new construction of an OCR system for English characters.
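The three stages named above can be illustrated with a deliberately small sketch. It is not the system proposed in this review: the threshold value, the use of raw pixels as features, the Hamming-distance template matcher, and the 3x3 toy glyphs in `TEMPLATES` are all assumptions chosen only to make the stages concrete.

```python
from typing import Dict, List

Image = List[List[int]]  # grayscale rows, 0 (black) to 255 (white)


def preprocess(gray: Image, threshold: int = 128) -> Image:
    """Stage 1 (preprocessing): binarize so that ink pixels are 1, background 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def extract_features(binary: Image) -> List[int]:
    """Stage 2 (feature extraction): flatten the glyph into a feature vector.

    Raw pixels are the simplest possible feature; real systems use richer ones.
    """
    return [px for row in binary for px in row]


def discriminate(features: List[int], templates: Dict[str, List[int]]) -> str:
    """Stage 3 (discrimination): return the label of the closest template."""
    def hamming(a: List[int], b: List[int]) -> int:
        return sum(x != y for x, y in zip(a, b))

    return min(templates, key=lambda label: hamming(features, templates[label]))


# Hypothetical 3x3 reference glyphs, written row by row.
TEMPLATES: Dict[str, List[int]] = {
    "I": [0, 1, 0,
          0, 1, 0,
          0, 1, 0],
    "L": [1, 0, 0,
          1, 0, 0,
          1, 1, 1],
}


def recognize(gray: Image, templates: Dict[str, List[int]] = TEMPLATES) -> str:
    """Chain the three stages for a single, already-segmented character."""
    return discriminate(extract_features(preprocess(gray)), templates)


# A noisy scan of "I": one stray dark pixel in the bottom-right corner.
scan = [
    [250, 30, 240],
    [245, 20, 250],
    [255, 40, 100],
]
print(recognize(scan))
```

An error introduced in any one stage propagates to the next: a poorly chosen threshold in `preprocess`, for example, changes the feature vector and can flip the decision made in `discriminate`.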
An Optical Character Recognition system is software engineered to convert text written on paper into machine-editable
text formats. Character recognition has been an important area of research over the past three decades, and its
importance has grown considerably since office automation projects were first taken up. At present, character
recognition is one of the most important activities in document processing. Because different languages have
different character sets, intense research has been going on into automating their recognition during document
processing.
The conventional methods used for character recognition mostly follow a matrix-based approach in which each
character is divided into a predefined number of rows and columns. Then, depending on the character being processed, a
particular set of cells in the horizontal and vertical directions is selected. Another approach, called the connected
component approach, traces the character being processed from one end to the other to find its different parts.
Approaches of this nature involve excessive computation and are mostly time-consuming. These conventional approaches
also require some preprocessing, such as thinning, before the character itself is taken up for recognition.
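Both conventional approaches can be sketched in a few lines. The code below is an illustrative assumption, not a method taken from the surveyed literature: `zone_densities` uses ink density per cell as one common way of turning the row/column grid into features, and `count_components` uses an 8-connected flood fill as one way of finding the separate parts of a character. The glyph `GLYPH_I` is a hypothetical binary image of a lowercase "i".

```python
from typing import List

Binary = List[List[int]]  # 1 = ink, 0 = background


def zone_densities(img: Binary, rows: int, cols: int) -> List[float]:
    """Matrix-based features: split the glyph into rows x cols cells and
    return the fraction of ink pixels in each cell, row by row."""
    h, w = len(img), len(img[0])
    feats: List[float] = []
    for r in range(rows):
        top, bottom = r * h // rows, (r + 1) * h // rows
        for c in range(cols):
            left, right = c * w // cols, (c + 1) * w // cols
            cell = [img[y][x] for y in range(top, bottom) for x in range(left, right)]
            feats.append(sum(cell) / len(cell) if cell else 0.0)
    return feats


def count_components(img: Binary) -> int:
    """Connected-component view: count 8-connected groups of ink pixels."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            count += 1              # found an unvisited part of the character
            seen[y][x] = True
            stack = [(y, x)]
            while stack:            # flood fill the whole part
                cy, cx = stack.pop()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count


# Hypothetical 6x4 glyph of a lowercase "i": a dot above a stem.
GLYPH_I: Binary = [
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]

print(zone_densities(GLYPH_I, rows=3, cols=2))  # [0.25, 0.25, 0.5, 0.5, 0.5, 0.5]
print(count_components(GLYPH_I))                # 2 (the dot and the stem)
```

Both functions visit every pixel of the glyph at least once, which gives some sense of why such approaches become costly when applied to every character of a large document.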
Types of Optical Character Recognition
Optical Character Recognition has been a subject of research. Pattern recognition has three main steps: observation,
pattern segmentation, and pattern classification.
OCR systems transform large volumes of documents, whether printed or handwritten, into machine-encoded text, ideally
without being affected by transformations, noise, resolution variations and other factors. In general, handwriting
recognition is classified into two types:
1. Off-line Character Recognition.
2. Online Character Recognition.
Off-line handwriting recognition involves the automatic conversion of text in an image into letter codes that are
usable within computer and text-processing applications. Off-line handwriting recognition is more difficult, as
different people have different handwriting styles.