Off-line Farsi / Arabic Handwritten Word
Recognition Using Vector Quantization and Hidden
Markov Model
Behroz Vaseghi
Engineering Department,
Islamic Azad University,
Najafabad Branch, Iran
vaseghi@znu.ac.ir
Shahpour Alirezaee
Engineering Department,
University of Zanjan,
Zanjan, Iran
Alirezaee@znu.ac.ir
Majid Ahmadi
ECE Department,
University of Windsor,
Windsor, Ontario, Canada
Ahmadi@uwindsor.ca
Rasoul Amirfattahi
ECE Department,
Isfahan University of
Technology, Isfahan, Iran
fattahi@iut.ac.ir
Abstract-In this paper, a Farsi handwritten word recognition
system for reading city names in postal addresses is presented.
The method is based on vector quantization (VQ) and hidden
Markov models (HMM). A window sliding from right to left is used
to extract the features (four features are proposed).
After feature extraction, K-means clustering is used to
generate a codebook, and VQ generates a codeword for each
word image. In the next stage, an HMM is trained with the
Baum-Welch algorithm for each city name. A test image is recognized by
finding the best match (likelihood) between the image and all of
the HMM word models using the forward algorithm. Experimental
results show the advantages of using the VQ/HMM recognizer engine
instead of a conventional discrete HMM.
Keywords- Off-line character recognition, HMM, VQ.
I. INTRODUCTION
During the past decade, the pattern recognition community has
achieved remarkable progress in the field of handwritten
word recognition. Many papers dealing with applications of
handwritten word recognition to automatic reading of postal
addresses, bank checks and forms (invoices, coupons, revenue
documents, etc.) have been published [1]. Most of these
works, however, dealt with the recognition of Latin and Chinese
scripts. Progress in Arabic script recognition has been slow,
mainly due to the special characteristics of Arabic script.
Arabic text is inherently cursive in both handwritten and
printed forms and is written horizontally from right to left. The
reader is referred to [2-3] for further details on Arabic script
characteristics and also the state of the art of Arabic character
recognition. Farsi writing, which this paper addresses, is very
similar to Arabic in terms of strokes and structure. The only
difference is that Farsi has four more characters than Arabic in
its character set. Therefore, a Farsi word recognizer can also be
used for Arabic word recognition. This paper presents a Farsi
handwritten word recognition system based on discrete hidden
Markov models and vector quantization.
II. THE WORD RECOGNITION SYSTEM
The proposed system is designed for reading city names
from the address field. This application falls within the limited
978-1-4244-2824-3/08/$25.00 ©2008 IEEE
lexicon category. The lexicon size is limited (198 city names)
or can be pruned by using additional information such as ZIP
codes. The block diagram of the system is illustrated in Fig. 1.
An image of a postal envelope is captured using a scanner at
300-dpi resolution with 256 gray levels. The city name
is then extracted from the image and assigned a label
between 1 and 198. Our database consists of 6000 word images of
Iranian city names.
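For illustration, the final recognition stage of such a system (quantizing each frame's feature vector to its nearest codebook entry, then scoring the resulting symbol sequence against one discrete HMM per city name with the forward algorithm) can be sketched as follows. This is a minimal sketch rather than the authors' implementation: the function names, the Euclidean nearest-neighbor rule, the per-step scaling, and the layout of `models` (a dict mapping a label to `(start, trans, emit)` probability tables) are all assumptions, and training of the codebook and of the HMMs is not shown.

```python
import math


def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry
    (squared Euclidean distance; the distance measure is an assumption)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(codebook)), key=lambda k: sqdist(f, codebook[k]))
            for f in features]


def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm for a discrete HMM.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability of emitting symbol k in state i
    Returns log P(obs | model), or -inf if the sequence is impossible.
    """
    if not obs:
        return 0.0
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        # Rescale at every step so long sequences do not underflow.
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob


def recognize(obs, models):
    """Return the label whose HMM gives the highest likelihood for obs."""
    return max(models, key=lambda label: forward_log_likelihood(obs, *models[label]))
```

Working in the log domain with per-step scaling is one common way to keep the forward recursion numerically stable; the scores remain directly comparable across the word models.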
A. Preprocessing
The preprocessing consists of the following steps:
• Binarization: The gray-level image of a word is
binarized at a threshold determined by a modified
version of the maximum entropy sum and entropic
correlation methods [4].
• Noise removal: The binarized image often has
spurious segments, which are removed by a
morphological closing operation followed by a
morphological opening operation, both with a 3x3
window as the structuring element.
• Bounding box: To reduce memory use and
increase processing speed, the binarized
image is reduced to its bounding rectangle
(Fig. 2b).
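The noise-removal and bounding-rectangle steps above can be sketched roughly as follows. This is only an illustrative sketch: the image is assumed to be a list of rows of 0/1 values with 1 for ink, pixels outside the image are assumed to be background, and the function names are hypothetical. The threshold selection of [4] is not reproduced here.

```python
def dilate(img):
    """3x3 binary dilation: a pixel becomes 1 if any pixel in its 3x3 window is 1."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            out[r][c] = int(any(
                img[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, h))
                for cc in range(max(c - 1, 0), min(c + 2, w))
            ))
    return out


def erode(img):
    """3x3 binary erosion: a pixel stays 1 only if its whole 3x3 window is 1.
    Border pixels are always cleared (outside is treated as background)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r][c] = int(all(
                img[rr][cc]
                for rr in range(r - 1, r + 2)
                for cc in range(c - 1, c + 2)
            ))
    return out


def remove_noise(img):
    """Closing (dilate, then erode) followed by opening (erode, then dilate)."""
    closed = erode(dilate(img))
    return dilate(erode(closed))


def crop_to_bounding_box(img):
    """Keep only the smallest rectangle that contains every ink pixel."""
    rows = [r for r, row in enumerate(img) if any(row)]
    if not rows:
        return []
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]
```

In this sketch the closing fills small holes inside strokes and the opening removes isolated specks smaller than the 3x3 window, after which the crop discards the empty margins.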
B. Frame generation
In this phase, the word image is converted to a
sequential form suitable for the HMM recognition engine. The area
of the image is divided into a set of fixed-width vertical frames
(strips) from right to left. The width of a frame is set to
approximately twice the average stroke width of the word
image, and there is a 50% overlap between two consecutive
frames.
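As a rough illustration of the framing scheme, the sketch below computes the column ranges of the frames from right to left. It is a sketch under assumptions: the average stroke width is taken as an input because its estimation is not described here, the names `frame_spans` and `extract_frames` are hypothetical, and letting the last (leftmost) frame be narrower than the others is a choice made for this example only.

```python
def frame_spans(image_width, stroke_width):
    """Return (left, right) column ranges of the frames, ordered right to left.

    Frame width is about twice the average stroke width, and consecutive
    frames overlap by 50%, so the window moves half a frame per step.
    """
    frame_width = max(2, round(2 * stroke_width))
    step = max(1, frame_width // 2)
    spans = []
    right = image_width
    while right > 0:
        left = max(0, right - frame_width)
        spans.append((left, right))
        if left == 0:
            break  # reached the left edge of the word image
        right -= step
    return spans


def extract_frames(img, stroke_width):
    """Cut a binary image (list of rows) into overlapping vertical strips."""
    width = len(img[0]) if img else 0
    return [[row[left:right] for row in img]
            for left, right in frame_spans(width, stroke_width)]
```

For a word image 10 pixels wide with an average stroke width of 2 pixels, this gives four frames of width 4, starting at the right edge and shifted by 2 pixels each.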
C. Feature extraction
In this stage, four features are extracted from each frame
of the image: