Bangla Off-line Handwritten Character Recognition Using Superimposed Matrices

Ahmed Shah Mashiyat
Computer Science and Engineering Discipline, Khulna University, Khulna-9208, Bangladesh.
mashiyatkucse@yahoo.com

Ahmed Shah Mehadi
Pioneer computers, Sheltech Sierra, New elephant road, Dhaka-1205, Bangladesh.
a_mehadi@yahoo.com

Kamrul Hasan Talukder
Computer Science and Engineering Discipline, Khulna University, Khulna-9208, Bangladesh.
kamrul9375@yahoo.com

Abstract
This paper presents an off-line recognition system for Bangla handwritten characters using superimposed matrices. We observe that the same character written by different individuals always shows at least a minimal level of similarity. In this system, the Bangla text, accepted as an image file, is first segmented into lines and words, and each word is then segmented into characters. Next, the boundary of each character is determined. The characters are scaled to a standard size using an image scaling algorithm and stored in a 32×32 matrix. This matrix is then compared with a knowledge base in which recognized characters written by various persons are stored in superimposed form. Finally, depending on the similarity of the character to the stored ones, the system selects the recognized character for the output. The system is suitable for converting handwritten texts into printed documents.

Keywords
Text segmentation, character recognition, superimposed matrices, pattern recognition, water reservoir principle.

INTRODUCTION
Bangla is one of the richest languages of the world. More than 200 million people use Bangla as their medium of communication. Researchers around the world are therefore working to computerize the Bangla language, and character recognition plays an important role in this effort.
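As a rough illustration of the matching idea summarized in the abstract, the following Python sketch builds a superimposed matrix per character and scores an input against it. It rests on assumptions that the text does not confirm: superimposition is taken to be an element-wise sum of binary 32×32 samples, and the similarity score (per-pixel agreement with the stored samples) is a hypothetical choice. The names `superimpose`, `similarity`, `recognize` and the toy `bar` patterns are invented for this example.

```python
import numpy as np

SIZE = 32  # each scaled character is stored in a 32x32 matrix


def superimpose(samples):
    """Assumed superimposition: element-wise sum of binary SIZE x SIZE samples."""
    acc = np.zeros((SIZE, SIZE), dtype=np.int32)
    for s in samples:
        acc += (np.asarray(s) > 0).astype(np.int32)
    return acc


def similarity(char, stored, n_samples):
    """Hypothetical score in [0, 1]: average fraction of stored samples
    that agree with the input at each pixel (ink vs. background)."""
    ink = np.asarray(char) > 0
    agree = np.where(ink, stored, n_samples - stored)
    return float(agree.mean()) / n_samples


def recognize(char, knowledge_base):
    """knowledge_base maps a label to (superimposed matrix, sample count)."""
    return max(knowledge_base, key=lambda k: similarity(char, *knowledge_base[k]))


def bar(vertical, offset):
    """Toy stand-in for a handwritten sample: a 4-pixel-wide bar."""
    m = np.zeros((SIZE, SIZE), dtype=np.uint8)
    if vertical:
        m[:, 14 + offset:18 + offset] = 1
    else:
        m[14 + offset:18 + offset, :] = 1
    return m


# Three slightly shifted "writers" per toy class
kb = {
    "vertical": (superimpose([bar(True, d) for d in (-1, 0, 1)]), 3),
    "horizontal": (superimpose([bar(False, d) for d in (-1, 0, 1)]), 3),
}
result = recognize(bar(True, 2), kb)  # an unseen, further-shifted vertical bar
print(result)
```

Summing the samples means that pixels on which most writers agree carry more weight than pixels that only one writer inked, which is one plausible reading of "superimposed form".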
Although the recognition of printed Bangla characters and numerals is close to being solved, with excellent recognition rates, researchers working on handwritten words cannot boast the same success. This has been ascribed to the difficult nature of unconstrained Bangla handwriting, including the diversity of character patterns, the ambiguity and illegibility of characters, and the overlapping nature of handwriting. Research is nevertheless progressing. Character recognition has been carried out with satisfactory accuracy using the DP approach [1], intelligent regional search [3], artificial neural networks [4,5,8,9], self-organizing maps [10], and hidden Markov models [11]. In this paper, we propose a system that uses superimposed matrices for character recognition. Because handwritten Bangla text is cursive and its characters often touch, we focus on the segmentation phase of the document being processed. Our system takes a page of a handwritten document as input, detects the boundary of the document, segments the document into lines, words and characters, detects the boundary of each character, removes noise, scales the character, extracts the features, identifies the class, searches for a match, and finally prints the recognized character. The system block diagram is given in Figure 1.

Figure 1: A block diagram representing the system

BANGLA CHARACTER SET IN OUR VIEW
The Bangla alphabet has 50 letters, of which 11 are vowels (Sorborno) and 39 are consonants (Banjonborno). The character set also includes 10 vowel modifiers (i.e., Kar), 7 consonant modifiers (i.e., Fala), and 10 digits. Besides these, there are about 253 compound characters composed of 2, 3, or 4 consonants (200 composed of 2 consonants, 51 composed of 3 consonants, and 2 composed of 4 consonants) [6].
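The counts quoted above can be tallied with a few lines of Python. This is only a sanity check under the assumption that every listed symbol counts as one separate pattern; the text does not spell out how its own total is counted, and the dictionary keys are names invented for this example.

```python
# Counts as quoted in the text; key names are illustrative only.
counts = {
    "vowels": 11,
    "consonants": 39,
    "vowel_modifiers": 10,
    "consonant_modifiers": 7,
    "digits": 10,
    "compound_2_consonants": 200,
    "compound_3_consonants": 51,
    "compound_4_consonants": 2,
}

letters = counts["vowels"] + counts["consonants"]
compounds = sum(v for k, v in counts.items() if k.startswith("compound_"))
total = sum(counts.values())

print(letters, compounds, total)  # 50 253 330
```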
As a result, the total number of patterns to be recognized is more than 310. It is very difficult to recognize a single character from such a large number of characters. To improve the performance of the system, we have employed a grouping concept, in which each group is defined as a class. The characters share many similar features. However, some characters also have very distinct features that make them completely different from the others [5]. These features help us form the classes. We have classes with ‘Matra’, left vertical line, right vertical line, upper part, disjoint part, lower part, and their composites. Some classes are shown in Figure 2.

Figure 2: Features of some classes

RECOGNITION SYSTEM
The recognition system has two main phases: segmenting the text into characters and identifying the characters. The whole process is described as follows: