© Springer International Publishing Switzerland 2015. S.C. Satapathy et al. (eds.), Proc. of the 3rd Int. Conf. on Front. of Intell. Comput. (FICTA) 2014 – Vol. 1, Advances in Intelligent Systems and Computing 327, DOI: 10.1007/978-3-319-11933-5_62

Word-Level Script Identification from Handwritten Multi-script Documents

Pawan Kumar Singh*, Arafat Mondal, Showmik Bhowmik, Ram Sarkar, and Mita Nasipuri
Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
pawansingh.ju@gmail.com

Abstract. This paper proposes a robust word-level handwritten script identification technique. A combination of shape-based and texture-based features is used to identify the script of handwritten word images written in any of five scripts, namely Bangla, Devnagari, Malayalam, Telugu and Roman. An 87-element feature set is designed to evaluate the present script recognition technique. The technique has been tested on 3000 handwritten words, to which each script contributes about 600 words. Based on the identification accuracies of multiple classifiers, the Multi Layer Perceptron (MLP) has been chosen as the best classifier for the present work. With 5-fold cross-validation and an epoch size of 500, the MLP classifier produces the best recognition accuracy of 91.79%, which is encouraging considering the shape variations of these scripts.

Keywords: Script identification, Handwritten Indic scripts, Texture based feature, Shape based feature, Multiple Classifiers.

1 Introduction

India is a multilingual country in which people living in different regions use different languages and scripts. Each script has its own characteristics, which are very different from those of other scripts.
Therefore, in this multilingual environment, separating or identifying the different scripts beforehand is of utmost importance when developing a successful Optical Character Recognition (OCR) system for any script, because it is perhaps impossible to design a single recognizer that can identify a variety of scripts and languages. Script identification facilitates many important applications, such as sorting document images, selecting the appropriate script-specific OCR system, and searching digitized archives for document images containing a particular script.

* Corresponding author.

Resemblances among the character sets of different scripts are more likely in handwritten documents than in printed ones. Cultural differences, individual differences, and even differences in the way people write at different times enlarge the inventory of possible word shapes seen in handwritten documents. Also,