Stroke Analysis of Devnagari Handwritten Characters

PRACHI MUKHERJI*, PRITI P. REGE**
Electronics and Telecommunication Department,
* Sinhgad Technical Education Society's SKNCOE, Pune 411041.
** College of Engineering Pune, Shivajinagar, Pune 411005.
INDIA

Abstract: - Devnagari script is the major script of India and is widely used for various languages. In this work we propose a stroke-based technique for analyzing handwritten Devnagari characters. After preprocessing, the character is segmented into strokes using our thinning and segmentation algorithm. We propose average compressed direction codes for the segmented strokes to classify them as left curve, right curve, horizontal stroke, vertical stroke, slanted line, etc. Knowledge of the script grammar is applied to identify the character using the shapes of the strokes, mean, relative strength, straightness, circularity and area. The system tolerates a slant of about 10° to the left or right and a skew of 5° up or down. The system gives high discrimination between similar characters and a recognition accuracy of 85%.

Keywords: Devnagari Script, Character Segmentation, Thinning, Strokes, Average Compressed Direction Code, Unordered Stroke Matching.

1 Introduction

Over 500 million people all over the world use Devnagari script. It provides the written form for over forty languages [1], including Hindi, Konkani and Marathi. It is a logical composition of its constituent symbols in two dimensions [2]. A marked distinction between Devnagari script and scripts of the Roman genre is that a character represents a syllabic sound, complete in itself. Intense research has been done on English, Latin, Chinese, Persian, Tamil and Bangla scripts, for both handwritten and machine-printed text. While most published work on Devnagari deals with printed text, very little has been reported for handwritten Devnagari script.
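As an illustration of the stroke classes named in the abstract, the following minimal Python sketch computes a plain Freeman 8-direction chain code for an ordered list of skeleton pixels, collapses repeated codes, and labels a stroke by its dominant orientation. It is not the average compressed direction code proposed in this work, whose definition is not given in this section. The function names, the direction numbering and the 0.8 tolerance are assumptions made only for illustration.

```python
from itertools import groupby

# Freeman 8-direction codes keyed by (d_row, d_col); rows grow downward.
# 0 = east, 1 = north-east, 2 = north, ..., 7 = south-east (assumed numbering).
DIRECTIONS = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}


def direction_codes(points):
    """Chain code of an ordered, 8-connected list of (row, col) skeleton pixels."""
    return [DIRECTIONS[(r1 - r0, c1 - c0)]
            for (r0, c0), (r1, c1) in zip(points, points[1:])]


def compress(codes):
    """Collapse each run of identical codes into a single code."""
    return [code for code, _ in groupby(codes)]


def classify_straight(codes, tolerance=0.8):
    """Label a stroke by its dominant orientation, ignoring traversal sense.

    Returns 'horizontal', 'vertical' or 'slanted', or 'curve' when no single
    orientation covers at least `tolerance` of the steps (hypothetical rule).
    """
    if not codes:
        return "point"
    # Fold opposite directions together: 0/4 -> 0, 1/5 -> 1, 2/6 -> 2, 3/7 -> 3.
    folded = [c % 4 for c in codes]
    best = max(set(folded), key=folded.count)
    if folded.count(best) / len(folded) < tolerance:
        return "curve"
    return {0: "horizontal", 2: "vertical"}.get(best, "slanted")
```

For example, six pixels along one image row yield the codes 0, 0, 0, 0, 0, which compress to a single 0 and are labeled horizontal, while a path that runs right and then turns downward mixes several orientations and is labeled a curve. Distinguishing left curves from right curves would need the order in which the directions change, which this sketch does not attempt.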
One of the first attempts at recognizing handprinted characters was by Sethi [3], and at typed Devnagari script by Sinha and Mahabala [4]. V. Bansal and Sinha [6] divided the typed word into three strips (top, core and bottom) and achieved 93% performance on individual characters. Pal and Chaudhuri attempted OCR for two scripts, Bangla and Devnagari, in [5]. In [7], a binary wavelet transform is used for feature extraction from handwritten Devnagari characters. In [8], a survey of structural techniques used for feature extraction in OCR of different scripts is given. More recently, a quadratic-classifier-based method with 81% accuracy was proposed in [9].

In this work, we analyze 44 isolated handwritten characters in Devnagari script. The handwriting should be legible and adhere to the structural syntax of Devnagari script. The processing steps of our OCR system can be summarized concisely. The documents are scanned, and the digitized images are subjected to preprocessing techniques such as filtering, binarization, skeletonization and pruning. The feature extraction module segments the character into strokes. Various features of these segmented

6th WSEAS International Conference on CIRCUITS, SYSTEMS, ELECTRONICS, CONTROL & SIGNAL PROCESSING, Cairo, Egypt, Dec 29-31, 2007 438