ORIGINAL ARTICLE
Urdu Nasta’liq text recognition system based on multi-dimensional recurrent neural network and statistical features
Saeeda Naz¹,⁴ · Arif I. Umar¹ · Riaz Ahmad²,⁵ · Saad B. Ahmed³ · Syed H. Shirazi¹ · Muhammad I. Razzak³
Received: 26 April 2015 / Accepted: 26 August 2015
© The Natural Computing Applications Forum 2015
Abstract Character recognition for cursive scripts such as Arabic and handwritten English and French is a challenging task, and it becomes more complicated for Urdu Nasta’liq text because this script is more complex than Arabic. Recurrent neural networks (RNNs) have shown excellent performance on English, French and cursive Arabic scripts owing to their sequence-learning property. Most recent approaches perform segmentation-based character recognition; however, due to the complexity of the Nasta’liq script, the segmentation error is considerably higher than for the Arabic Naskh script. RNNs have provided promising results in such scenarios. In this paper, we achieve high accuracy for Urdu Nasta’liq using statistical features and a multi-dimensional long short-term memory network. We present a robust feature extraction approach that extracts features using a right-to-left sliding window. Results show that the selected features significantly reduce the label error. For evaluation, we used the Urdu printed text images dataset and compared the proposed approach with recent work. The system achieved 94.97 % recognition accuracy on unconstrained printed Nasta’liq text lines and outperforms the state-of-the-art results.
Keywords Multi-dimensional recurrent neural network ·
Long short-term memory · OCR · Urdu
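To make the abstract's "right-to-left sliding window" feature extraction concrete, the following is a minimal illustrative sketch: a binary text-line image is scanned window by window from the rightmost column leftward (Urdu reading order), and a few simple statistics are computed per window. The window width and the particular statistics (ink density, vertical centroid, vertical spread) are assumptions for illustration only, not the exact feature set used in the paper.

```python
import numpy as np

def sliding_window_features(line_img, win_w=4):
    """Per-window statistics from a binary text-line image, right to left.

    line_img : 2D array, 1 = ink (foreground), 0 = background.
    Returns a (num_windows, 3) array of [density, centroid_y, spread_y].
    Window width and feature choices are illustrative assumptions.
    """
    h, w = line_img.shape
    feats = []
    # Walk from the rightmost column toward the left, one window at a time.
    for x_end in range(w, 0, -win_w):
        win = line_img[:, max(0, x_end - win_w):x_end]
        ink = win.sum()
        ys, _ = np.nonzero(win)
        # Normalized vertical centroid and spread of ink (0 if window is empty).
        cy = ys.mean() / h if ink else 0.0
        spread = ys.std() / h if ink else 0.0
        density = ink / win.size
        feats.append([density, cy, spread])
    return np.asarray(feats)
```

Each row of the returned array would serve as one time-step input to the sequence learner, so the network consumes the line in the same right-to-left order in which Nasta’liq is written.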
1 Introduction
The rapid advancement of technology, especially in document image analysis, has been evidence of reliable and efficient OCR systems over the last few decades. In an era of globalization, online information access and communication technology have prompted publishing bodies to make documents available in local and national languages using legacy technology. These documents can be newspapers, novels, stories, proverbs and books, and most of them exist in the form of images. Legacy technology makes it tedious to transfer, maintain and access such documents over the internet under the constraint of low bandwidth. Moreover, such image documents are unsearchable, uneditable and occupy more storage. The advent of Android technology and its use in smartphones, tablets and PDAs has made internet access available at low cost. This prompts researchers to propose ideas that let users view text-bearing images on their handheld devices. These text images can be printed or handwritten documents or images of signboards. There is
✉ Muhammad I. Razzak
  imranrazak@hotmail.com
  Saeeda Naz
  saeedanaz292@gmail.com
  Arif I. Umar
  arifiqbalumar@yahoo.com
  Riaz Ahmad
  rahmad@rhrk.uni-kl.de
  Saad B. Ahmed
  isaadahmed@gmail.com
  Syed H. Shirazi
  mirpak@gmail.com
1 Department of Information Technology, Hazara University, Mansehra, Pakistan
2 University of Technology, Kaiserslautern, Germany
3 King Saud Bin Abdul Aziz University for Health Sciences, Riyadh, Saudi Arabia
4 Higher Education Department, GGPGC No. 1, Abbottabad, KPK, Pakistan
5 Shaheed Benazir Bhutto University, Sheringal, Pakistan
Neural Comput Applic
DOI 10.1007/s00521-015-2051-4