An improved offline handwritten character segmentation algorithm for Bangla script

Samir Malakar¹, Prasenjit Ghosh², Ram Sarkar², Nibaran Das², Subhadip Basu², Mita Nasipuri²

¹ Dept. of Master of Computer Application, M.C.K.V. Institute of Engineering, Liluah, Howrah, India
² Dept. of Computer Science and Engineering, Jadavpur University, Kolkata, India
{malakarsamir, prasenjit.16, raamsarkar, nibaran, bsubhadip, mitanasipuri}@gmail.com

Abstract. Effective segmentation of offline word images of unconstrained handwritten Bangla script is a challenging problem in Optical Character Recognition (OCR) applications. The presence of a continuous horizontal line called ‘Matra’ is an important feature of this script. However, in unconstrained cursive handwriting, the Matra can be wavy or discontinuous, which makes segmentation difficult. The current work designs a novel technique for identifying potential segmentation points on the Matra in order to isolate the constituent characters from a word image of Bangla script. In the first stage, an 8-neighbour Connected Component Labelling (CCL) algorithm is applied to identify connected sub-parts of the word images. Each connected component is then to be classified into one of two classes, namely ‘Segment Further’ (SF) and ‘Do Not Segment’ (DNS). In the second stage, the trivial SF and DNS components are separated first, and the remaining components are classified into SF and DNS using a Multi-Layer Perceptron (MLP) based classifier. Finally, fuzzy segmentation features are applied to the SF components to identify potential segmentation points on the detected fuzzy Matra region, so that constituent characters or character sub-parts can be extracted from the overall word images. The present technique has been successfully applied to 500 handwritten Bangla word images, and it is found to perform better than our earlier character segmentation techniques [1-2].
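The first stage described in the abstract, 8-neighbour connected component labelling, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `label_components_8`, the list-of-lists binary image representation (1 = ink, 0 = background), and the breadth-first flood fill are all assumptions chosen for illustration.

```python
from collections import deque


def label_components_8(image):
    """Label 8-connected foreground components of a binary image.

    `image` is a list of rows holding 0 (background) or 1 (ink).
    Returns (labels, count): `labels` has the same shape as `image`,
    with 0 for background and 1..count for the components found.
    This representation is a hypothetical choice for the sketch.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background pixels and pixels that are already labelled.
            if image[r][c] != 1 or labels[r][c] != 0:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Visit all 8 neighbours (horizontal, vertical, diagonal).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Toy "word image": the pixel at (1, 2) touches the top-left stroke
# only diagonally, so it joins that stroke under 8-connectivity.
word = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1],
]
labels, count = label_components_8(word)
```

With 8-connectivity, diagonally touching strokes fall into the same component, which suits cursive handwriting where strokes often meet at a corner. Each labelled component would then be passed to the SF/DNS classification stage.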
Keywords: Character segmentation, handwritten word images, OCR system, Bangla script, Fuzzy features, Component classification.

1 Introduction

Character segmentation seeks to decompose word images into sub-images of constituent characters. It is needed prior to machine recognition of individual character

Proceedings of the Fifth Indian International Conference on Artificial Intelligence