Handwritten Kannada Vowels and English Character Recognition System

International Journal of Image Processing and Vision Sciences (IJIPVS), Volume-1, Issue-1, 2012

B.V. Dhandra, Gururaj Mukarambi
Department of P.G. Studies and Research in Computer Science, Gulbarga University, Gulbarga, Karnataka
dhandra_b_v@yahoo.co.in, gmukarambi@gmail.com

Mallikarjun Hangarge
Department of Computer Science, Karnatak Arts, Science & Commerce College, Bidar, Karnataka
mhangarge@yahoo.co.in

Abstract— In this paper, zone-based features are extracted from images of handwritten Kannada vowels and English uppercase characters for their recognition. A total of 4,000 handwritten Kannada and English sample images are collected for classification. The collected images are normalized to 32 x 32 pixels. Each normalized image is then divided into 64 zones and the pixel density of each zone is calculated, giving a total of 64 features. These 64 features are submitted to KNN and SVM classifiers with 2-fold cross-validation for recognition of the said characters. The proposed algorithm works for Kannada vowels alone, for English uppercase alphabets alone, and for a mixture of both. Recognition accuracies of 92.71% for KNN and 96.00% for SVM are achieved for handwritten Kannada vowels, and 97.51% for KNN and 98.26% for SVM for handwritten English uppercase alphabets. Further, recognition accuracies of 95.77% and 97.03% are obtained for mixed characters (i.e. Kannada vowels and English uppercase alphabets). Hence, the proposed algorithm is efficient for recognition of the said characters. The novelty of the proposed work is that the algorithm is independent of thinning and of the slant of the characters.

Keywords- DIA, OCR, KNN, SVM

I. INTRODUCTION

Document images can be classified into printed document images, handwritten document images, and documents containing both.
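The zone-based feature extraction described in the abstract can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function name `zone_density_features`, the 8 x 8 grid of equal 4 x 4 pixel zones, and the definition of pixel density as the fraction of foreground pixels in a zone are assumptions. The paper states only that a 32 x 32 image is divided into 64 zones whose pixel densities form 64 features.

```python
def zone_density_features(image, grid=8):
    """Return grid*grid zone densities for a square binary image.

    `image` is a list of rows; any truthy pixel counts as foreground.
    Assumption: zones are equal, non-overlapping squares, and density
    is (foreground pixels in zone) / (pixels in zone).
    """
    size = len(image)
    if size == 0 or size % grid or any(len(row) != size for row in image):
        raise ValueError("expected a square image whose side is divisible by grid")

    step = size // grid          # zone side length, 4 for a 32 x 32 image
    zone_area = step * step
    features = []
    for zone_row in range(grid):
        for zone_col in range(grid):
            # Count foreground pixels inside the current zone.
            foreground = sum(
                1
                for r in range(zone_row * step, (zone_row + 1) * step)
                for c in range(zone_col * step, (zone_col + 1) * step)
                if image[r][c]
            )
            features.append(foreground / zone_area)
    return features
```

For a normalized 32 x 32 character image the function returns a 64-element vector, which is the kind of input that would then be passed to the KNN and SVM classifiers; the classifiers and the 2-fold cross-validation are not shown here.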
The analysis of such documents involves identifying the paragraphs, columns, lines, words and scripts in the document images, and so on. The recognition of handwritten scripts in a document is a challenging task because of the diversity of writing styles. If a document contains only one script, then the corresponding OCR system can be used for document image processing. If a document contains more than one script, then individual OCR systems may be used once the script/language has been identified. Efforts are now under way to develop a multilingual OCR system for automatic reading and processing of multilingual documents. Only a small amount of work in this direction has been carried out in the Indian context. This has motivated the study of a recognition system for handwritten Kannada vowels and English uppercase alphabets as initial work towards processing bilingual (Kannada and English) documents.
In the literature, many feature extraction methods have been used at various levels for handwritten character recognition in various scripts/languages, such as Fourier Descriptors, Shape Descriptors, Spatial, Discrete Cosine Transform, Radon Transform, Central Moments, Zernike Moments, Zone, Structural, Statistical, Optical Depth Decision tree etc. In the proposed study, an attempt is made to use zone-based features for the recognition of handwritten Kannada vowels and all 26 uppercase characters of English.
A brief account of the work reported in the literature follows. Dhandra et al [1] have proposed spatial features for handwritten Kannada and English character recognition, and have achieved recognition accuracies of 90.01% for handwritten Kannada vowels and 91.04% for handwritten English uppercase alphabets. Phokharatkul et al [2] have proposed an Ant-Miner algorithm for handwritten Thai character recognition and have reported a recognition accuracy of 82.07%. R.M.
Suresh et al [3] have proposed a fuzzy technique for Tamil handwritten character recognition and reported a recognition accuracy of 94%. Teng Long et al [4] have used the Dynamic Time Warping (DTW) algorithm for handwritten English uppercase alphabets collected from a camera user interface and reported recognition accuracies of 97.3%, 98.6% and 98.8% with 16, 32 and 256 normalized discrete angle values, respectively. Ertugrul Saatci et al [5] have proposed multiscale handwritten character recognition using CNN image filters for handwritten English uppercase alphabets and reported a recognition accuracy of 93%. Velappa Ganapathy et al [6] have proposed a multiscale neural network training technique for handwritten English uppercase alphabets and reported a recognition accuracy of 85%. Dayashankar Singh et al [7] have proposed directional features for Hindi and English characters and reported a recognition accuracy of 97% with 12 directional inputs. Dharamveer Sharma et al [8] have proposed zone-based features for isolated handwritten Gurumukhi script and reported a recognition accuracy of 72.54%. S. Arora et al [9] have proposed chain code features for handwritten Devnagari characters and reported a recognition accuracy of 98.03%. From the literature survey,