ORIGINAL ARTICLE
Urdu Nasta’liq text recognition system based on multi-dimensional recurrent neural network and statistical features
Saeeda Naz¹,⁴ · Arif I. Umar¹ · Riaz Ahmad²,⁵ · Saad B. Ahmed³ · Syed H. Shirazi¹ · Muhammad I. Razzak³
Received: 26 April 2015 / Accepted: 26 August 2015
© The Natural Computing Applications Forum 2015
Abstract Character recognition for cursive scripts such as Arabic and handwritten English and French is a challenging task, and it becomes more complicated for Urdu Nasta’liq text because this script is more complex than Arabic. Recurrent neural networks (RNNs) have shown excellent performance on English, French and cursive Arabic scripts owing to their sequence-learning property. Most recent approaches perform segmentation-based character recognition; however, due to the complexity of the Nasta’liq script, the segmentation error is considerably higher than for the Arabic Naskh script. RNNs have provided promising results in such scenarios. In this paper, we achieve high accuracy for Urdu Nasta’liq using statistical features and a multi-dimensional long short-term memory network. We present a robust feature extraction approach that extracts features using a right-to-left sliding window. Results show that the selected features significantly reduce the label error. For evaluation, we used the Urdu printed text images dataset and compared the proposed approach with recent work. The system achieved 94.97 % recognition accuracy on unconstrained printed Nasta’liq text lines and outperforms the state-of-the-art results.
Keywords Multi-dimensional recurrent neural network ·
Long short-term memory · OCR · Urdu
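To make the abstract's "right-to-left sliding window" feature extraction concrete, the following is a minimal illustrative sketch: a binary text-line image is scanned window by window from the rightmost column leftward (Urdu reading order), and a few simple statistics are computed per window. The window width and the particular statistics (ink density, vertical centroid, vertical spread) are assumptions for illustration only, not the exact feature set used in the paper.

```python
import numpy as np

def sliding_window_features(line_img, win_w=4):
    """Per-window statistics from a binary text-line image, right to left.

    line_img : 2D array, 1 = ink (foreground), 0 = background.
    Returns a (num_windows, 3) array of [density, centroid_y, spread_y].
    Window width and feature choices are illustrative assumptions.
    """
    h, w = line_img.shape
    feats = []
    # Walk from the rightmost column toward the left, one window at a time.
    for x_end in range(w, 0, -win_w):
        win = line_img[:, max(0, x_end - win_w):x_end]
        ink = win.sum()
        ys, _ = np.nonzero(win)
        # Normalized vertical centroid and spread of ink (0 if window is empty).
        cy = ys.mean() / h if ink else 0.0
        spread = ys.std() / h if ink else 0.0
        density = ink / win.size
        feats.append([density, cy, spread])
    return np.asarray(feats)
```

Each row of the returned array would serve as one time-step input to the sequence learner, so the network consumes the line in the same right-to-left order in which Nasta’liq is written.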
1 Introduction
The rapid advancement of technology, especially in document image analysis, has been evidence of reliable and efficient OCR systems over the last few decades. In an era of globalization, online information access and communication technology have prompted publishing bodies to make documents available in local and national languages using legacy technology. These documents can be newspapers, novels, stories, proverbs and books, and most of them exist in the form of images. Legacy technology makes it tedious to transfer, maintain and access such documents over the internet under the constraint of low bandwidth. Moreover, such image documents are unsearchable, uneditable and occupy more storage. The advent of Android technology and its use in smartphones, tablets and PDAs has made internet access available at low cost. This prompts researchers to propose ideas that let users view text-bearing images on their handheld devices. These text images can be printed or handwritten documents or images of signboards. There is
✉ Muhammad I. Razzak
  imranrazak@hotmail.com
  Saeeda Naz
  saeedanaz292@gmail.com
  Arif I. Umar
  arifiqbalumar@yahoo.com
  Riaz Ahmad
  rahmad@rhrk.uni-kl.de
  Saad B. Ahmed
  isaadahmed@gmail.com
  Syed H. Shirazi
  mirpak@gmail.com
1 Department of Information Technology, Hazara University, Mansehra, Pakistan
2 University of Technology, Kaiserslautern, Germany
3 King Saud Bin Abdul Aziz University for Health Sciences, Riyadh, Saudi Arabia
4 Higher Education Department, GGPGC No. 1, Abbottabad, KPK, Pakistan
5 Shaheed Benazir Bhutto University, Sheringal, Pakistan
Neural Comput Applic
DOI 10.1007/s00521-015-2051-4