Devanagari Isolated Character Recognition by using Statistical features (Foreground Pixels Distribution, Zone Density and Background Directional Distribution feature and SVM Classifier)

Mahesh Jangid
Department of Computer Science & Engineering
Dr. B R Ambedkar National Institute of Technology, Jalandhar, India

Abstract- In this paper, we present a methodology for off-line recognition of isolated handwritten Devanagari characters. The proposed methodology relies on three feature extraction techniques. The first is based on recursive subdivisions of the character image so that the resulting sub-images at each iteration have balanced (approximately equal) numbers of foreground pixels, as far as this is possible. The second is based on the zone density of the pixels, and the third is based on the directional distribution of the background pixels neighboring the foreground pixels. Together, the three techniques form a feature vector of size 314 for each handwritten Devanagari character. The dataset (12240 samples) of handwritten Devanagari characters was prepared from the handwriting of various people belonging to different age groups, and a recognition accuracy of 94.89% was obtained.

Keywords- Devanagari Character Recognition, Foreground pixel, Zone density, Background directional distribution, Support Vector Machine

I. INTRODUCTION

Machine simulation of human reading has been a topic of serious research since the introduction of digital computers. The main reason for this effort was not only the challenge of simulating human reading but also the possibility of efficient applications in which the data present on paper documents has to be transferred into machine-readable format. Automatic recognition of printed and handwritten information on documents such as cheques, envelopes, forms, and other manuscripts has a variety of practical and commercial applications in banks, post offices, libraries, and publishing houses.
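The first two feature extraction techniques summarized in the abstract can be illustrated with a short sketch. This is only one plausible reading of that summary, not the implementation used in this work: the function names (`balanced_split`, `subdivision_features`, `zone_density`), the parameters `levels` and `zones`, and the definition of zone density as the fraction of foreground pixels in a zone are all assumptions made for illustration. The character image is assumed to be a binary NumPy array with foreground pixels equal to 1.

```python
import numpy as np


def balanced_split(img):
    """Return (row, col) cut positions that roughly balance foreground pixels.

    The image is cut after `row` rows and after `col` columns. An image with
    no foreground pixels falls back to its geometric centre.
    """
    h, w = img.shape
    total = img.sum()
    if total == 0:
        return h // 2, w // 2
    col_cum = np.cumsum(img.sum(axis=0))  # foreground count up to each column
    row_cum = np.cumsum(img.sum(axis=1))  # foreground count up to each row
    # First column/row at which the cumulative count reaches half the total.
    col = int(np.searchsorted(col_cum, total / 2.0)) + 1
    row = int(np.searchsorted(row_cum, total / 2.0)) + 1
    return row, col


def subdivision_features(img, levels, origin=(0, 0)):
    """Collect division points (absolute row, col) over `levels` recursions."""
    if levels == 0:
        return []
    r, c = balanced_split(img)
    r0, c0 = origin
    points = [(r0 + r, c0 + c)]
    # Recurse into the four sub-images created by the division point.
    quadrants = [
        (img[:r, :c], (r0, c0)),
        (img[:r, c:], (r0, c0 + c)),
        (img[r:, :c], (r0 + r, c0)),
        (img[r:, c:], (r0 + r, c0 + c)),
    ]
    for sub, sub_origin in quadrants:
        points.extend(subdivision_features(sub, levels - 1, sub_origin))
    return points


def zone_density(img, zones=(4, 4)):
    """Fraction of foreground pixels in each cell of a zones[0] x zones[1] grid."""
    h, w = img.shape
    row_edges = np.linspace(0, h, zones[0] + 1).astype(int)
    col_edges = np.linspace(0, w, zones[1] + 1).astype(int)
    feats = []
    for i in range(zones[0]):
        for j in range(zones[1]):
            zone = img[row_edges[i]:row_edges[i + 1],
                       col_edges[j]:col_edges[j + 1]]
            feats.append(float(zone.mean()) if zone.size else 0.0)
    return np.array(feats)
```

With `levels` recursion levels, `subdivision_features` always returns (4^levels - 1) / 3 division points, so the feature length is fixed regardless of the character shape. How these features are combined with the directional features into the 314-element vector is not specified in the abstract and is left open here.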
Optical Character Recognition (OCR) is a field of research in pattern recognition, artificial intelligence and machine vision. OCR is a mechanism for converting a machine-printed or handwritten document into an editable text format. The field is broadly divided into two parts: on-line and off-line character recognition. Off-line character recognition is further divided into machine-printed and handwritten character recognition. Handwritten character recognition poses many more problems than machine-printed recognition, because different people have different writing styles, pen-tip sizes vary, and some people write with a skew. These challenges motivate researchers to work on the problem.

India is a multi-lingual, multi-script country with twenty-two languages. Eleven scripts are used to write these languages, and the Devanagari script is one of the oldest. It is used to write many languages such as Hindi, Nepali, Marathi, Sindhi and Sanskrit; Hindi is the third most widely spoken language in the world and an official language of India [1]. About 300 million people use the Devanagari script for documentation in the central and northern parts of India [2]. The script has a complex composition of its constituent symbols. The Devanagari script (Hindi) has 13 vowels and 36 consonants, shown in Figure 1. They are called basic characters. Vowels can be written as independent letters, or by using a variety of diacritical marks which are written above, below, before or after the consonant they belong to. When vowels are written in this way they are known as modifiers, and the characters so formed are called conjuncts. Sometimes two or more consonants combine and take new shapes. These new shape clusters are known as compound characters. All the characters have a horizontal line at the upper part, known as the Shirorekha or headline.
No English character has this characteristic, so it can be taken as a distinguishing feature for separating English text from these scripts. In continuous handwriting, written from left to right, the Shirorekha of one character joins with the Shirorekha of the previous or next character of the same word. In this fashion, multiple characters and modified shapes in a word appear as a single connected component joined through the common Shirorekha. Devanagari also has vowels, consonants, vowel modifiers, compound characters and numerals. Moreover, there are many similarly shaped characters. All these variations make handwritten character recognition a challenging problem.

Mahesh Jangid / International Journal on Computer Science and Engineering (IJCSE) ISSN : 0975-3397 Vol. 3 No. 6 June 2011 2400