Innovative Segmentation of Handwritten Text in Assamese Using Neural Network

Kaustubh Bhattacharyya and Kandarpa Kumar Sarma
Department of Electronics Science, Gauhati University, Guwahati, Assam, India

Abstract - In an Artificial Neural Network (ANN) based Optical Character Recognition (OCR) system, segmentation is an important step that precedes recognition. Segmentation of words into individual letters has been one of the major problems in handwriting recognition. Despite several successful works all over the world, the development of such tools for specific languages is still an ongoing process, especially in the Indian context. This work explores the application of ANNs as an aid to the segmentation of handwritten characters in Assamese, an important language in the North Eastern part of India. It examines the difference in performance between an ANN-based dynamic segmentation algorithm and projection-based static segmentation.

Keywords: Segmentation, Multilayered, Recognition, Character, Handwritten, Neuron.

1 Introduction

Artificial Neural Networks (ANNs) have been preferred tools for pattern recognition, including the recognition of optical characters, which broadly constitutes an important segment of Computer Vision and Machine Learning. This is because ANNs have the capacity to learn, adapt to changing environments and follow a computational paradigm that resembles the parallelism of the human brain. ANNs are therefore smart tools that can be applied to a host of pattern recognition and prediction problems [1]. An ANN works in two phases: training, during which it learns the patterns, and testing, which ascertains the extent of learning. Optical Character Recognition (OCR) is a common and popular example of the application of ANNs to pattern recognition. As mentioned in [2], [3], segmentation plays an important role in the overall process of recognizing printed and handwritten characters. This is even more so with cursive writing.
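To make the static baseline concrete, the following is a minimal sketch of projection-based segmentation. It is an illustration only and not the authors' implementation: it assumes a binarized word image in which ink pixels are 1 and background pixels are 0, and the function name `projection_segments` and the `min_ink` threshold are hypothetical choices made for this example.

```python
import numpy as np

def projection_segments(binary_img, min_ink=1):
    """Split a binary word image (ink = 1) into column ranges.

    A cut is placed wherever the vertical projection profile
    (ink pixels per column) falls below `min_ink`.
    Returns a list of (start, end) column indices, end exclusive.
    """
    profile = np.asarray(binary_img).sum(axis=0)  # ink count per column
    has_ink = profile >= min_ink

    segments = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                     # a new segment begins
        elif not ink and start is not None:
            segments.append((start, x))   # a blank column closes it
            start = None
    if start is not None:                 # segment touching the right edge
        segments.append((start, len(has_ink)))
    return segments

# Toy word: two well-separated vertical strokes.
word = np.zeros((5, 10), dtype=int)
word[:, 1:3] = 1
word[:, 6:9] = 1
segments = projection_segments(word)

# Same word with a thin connecting stroke, as in cursive writing.
touching = word.copy()
touching[2, 3:6] = 1
touching_segments = projection_segments(touching)

print(segments)           # [(1, 3), (6, 9)]
print(touching_segments)  # [(1, 9)]
```

In this toy case the two strokes are separated correctly only while a blank column lies between them. A single connecting stroke removes every blank column, so the profile-based rule returns one merged segment.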
The success or failure of an OCR system depends on the segmentation process. This paper describes work that uses ANNs for the segmentation stage of an ANN-based OCR system designed exclusively for Assamese, an important language of the North Eastern (NE) region of India. ANNs are used for segmentation for the following reasons:

1. Static segmentation methods suffer from a serious drawback: they cannot fix segmentation boundaries when the inputs vary in size and inclination.
2. Static segmentation methods also fail to fix segmentation boundaries when the inputs contain writer-induced variations.

Figure 1 shows the failure of a static segmentation method in dealing with writer-induced variations. ANNs can offer a solution in such cases because they have the ability to learn shapes and thereby discriminate segmentation boundaries.

ANNs have been used in several character recognition systems. Some of the segmentation methods relevant in practice are described in [2]. For cursive writing, Cheng, Liu et al. [4] provide a description of the available segmentation methods. Uses of ANNs for segmentation have been reported by Blumenstein [5]. Other similar works include [6], [7], [8], [9], [10], [11], [12], to name a few. Very few attempts to use ANNs for segmentation are known to have been reported in the Indian OCR scenario.

The paper is organized as follows. Section 1 has provided an introduction and an overview of the proposed system. Section 2 gives a short account of some of the important features of the Assamese script. Details of the experimental work are included in Section 3, and the experimental results are presented in Section 4. Section 5 concludes the paper with the inferences drawn and some directions for future work.