International Journal of Scientific and Research Publications, Volume 4, Issue 8, August 2014
ISSN 2250-3153
www.ijsrp.org

Analysis of structural features and classification of Gujarati consonants for offline character recognition

Hetal R. Thaker *, Dr. C. K. Kumbharana **

* Assistant Professor, Department of MCA, Atmiya Institute of Technology & Science, Rajkot, India
** Head, Department of Computer Science, Saurashtra University, Rajkot, India

Abstract- The wide range of applications and the many complexities involved in character recognition (CR) make it a continuing and open area of research. Feature selection and classification play a major role in achieving higher accuracy in character recognition. In the era of digitization, there is a compelling need for CR systems for regional scripts. This paper presents an analysis of structural features, and a classification based on them, for the consonants of the Gujarati script. Each character has certain characteristics that distinguish it from other characters. Gujarati consonants are analyzed for eight such structural features, and on that basis the characters are categorized into twenty groups. The paper further proposes a decision table to classify characters based on structural features.

Index Terms- Character Recognition, Handwritten Character Recognition, Gujarati character recognition, Structural feature analysis

I. INTRODUCTION

In the world, massive amounts of data are available on paper. To preserve this data in electronic form, it must be digitized with a scanner, which saves it in an image format. Operations such as searching and updating are difficult when data exist in image format, so the image must be converted into an editable form. This conversion requires image processing operations such as preprocessing, segmentation, feature selection, feature extraction, classification, and recognition.
Character recognition algorithms vary because scripts differ in characteristics such as the direction of writing (e.g. left to right: English, Hindi, Gujarati), the set of alphabets (e.g. English: A-Z, a-z), and the nature of writing, which defines how sentences are written (cursive script: English; Devnagari script: a line at the top of the character with matras around it). Many researchers have presented work on character recognition for English and Arabic scripts. A preliminary literature review indicates some work on South Indian scripts as well, whereas very little research could be traced on character recognition for the Gujarati script. Gujarati is an official language of Gujarat state, in the western part of India. This paper focuses on the analysis of structural features and presents that analysis as a decision table for classifying offline Gujarati consonants. The paper is organized into the following sections: previous work, the set of Gujarati consonants, the methodology of the proposed work (structural feature selection), and the analysis in the form of a decision table for the classification of Gujarati consonants.

II. PREVIOUS WORK

In character recognition, the process of extracting unique information from a binary image is called feature extraction. Feature extraction is an important step [1] [2], in which the features that help the system decide the character are extracted. Feature extraction methods for optical character recognition can be broadly classified into global transformation and series expansion, statistical features, and geometrical and topological features [1]. Geometrical and topological feature extraction is one of the popular methods among researchers [2]. A character is analyzed for its constitution, which ranges from simple geometrical shapes such as horizontal, vertical, and slanted lines to complex curves (e.g. C-curve, D-curve, U-curve), along with other characteristics such as closed regions, end points, and cross points.
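The topological characteristics named above (end points, cross points, closed regions) can be illustrated with a minimal Python sketch. It is not the method of this paper or of the cited works. It assumes the character has already been binarized and thinned to a one-pixel-wide skeleton, and it uses the crossing number (background-to-ink transitions around a pixel) as one common way to detect end points and cross points. All function names and the '#'/'.' input format are hypothetical, chosen only for illustration.

```python
from collections import deque

# 8-neighbour offsets in clockwise order, starting from the pixel directly above.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def to_grid(rows):
    """Turn equal-length strings of '#' (ink) and '.' (background) into a 0/1 grid."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def pixel(grid, r, c):
    """Return the pixel value, treating everything outside the image as background."""
    if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
        return grid[r][c]
    return 0


def crossing_number(grid, r, c):
    """Count background-to-ink transitions while walking once around the pixel."""
    ring = [pixel(grid, r + dr, c + dc) for dr, dc in RING]
    return sum(1 for i in range(8) if ring[i] == 0 and ring[(i + 1) % 8] == 1)


def count_end_and_cross_points(grid):
    """End point: crossing number 1. Cross point: crossing number of 3 or more."""
    ends = crosses = 0
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if not value:
                continue
            cn = crossing_number(grid, r, c)
            if cn == 1:
                ends += 1
            elif cn >= 3:
                crosses += 1
    return ends, crosses


def count_closed_regions(grid):
    """Count background regions (4-connected) that do not touch the image border."""
    h, w = len(grid), len(grid[0])
    # Pad with a background frame so the outside forms exactly one region.
    padded = [[0] * (w + 2)] + [[0] + row + [0] for row in grid] + [[0] * (w + 2)]
    seen = [[False] * (w + 2) for _ in range(h + 2)]
    regions = 0
    for r0 in range(h + 2):
        for c0 in range(w + 2):
            if padded[r0][c0] or seen[r0][c0]:
                continue
            regions += 1
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            while queue:  # breadth-first flood fill of one background region
                r, c = queue.popleft()
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    inside = 0 <= nr < h + 2 and 0 <= nc < w + 2
                    if inside and not padded[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
    return regions - 1  # discard the outer background region


def structural_features(rows):
    """Collect the three illustrative features for one skeleton image."""
    grid = to_grid(rows)
    ends, crosses = count_end_and_cross_points(grid)
    return {
        "end_points": ends,
        "cross_points": crosses,
        "closed_regions": count_closed_regions(grid),
    }


if __name__ == "__main__":
    # Toy skeletons: a plus-shaped stroke and a small closed loop.
    print(structural_features(["..#..", "..#..", "#####", "..#..", "..#.."]))
    print(structural_features([".###.", ".#.#.", ".###."]))
```

Feature vectors of this kind are the sort of input a decision table could branch on. Real handwritten samples would need noise removal and thinning first, since spurs and thick strokes distort the counts.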
The global and local properties of a character identified through structural feature examination, such as its topology and geometric shapes, are key to identifying characters with distortions and style variations. Suen et al. have proposed many such features in their work [3]. Heutte et al. [4] identified structural features that include the number of vertical and horizontal lines, intersections between the character and straight lines, hole positions, end points, presence of loops, number of intersections and junctions, and number of loops. Lee et al. [5] extracted several structural features for recognizing handwritten numerals. These features include the number of central, left, and right cavities, the location of each central cavity, crossings, the number of crossings with the principal and secondary axes, and pixel distribution. In the work presented by Amin et al. [6] on recognizing Arabic text, the structural features extracted include the number of sub-words, the number and position of complementary characters, the number of loops in each peak, and the width and height of each peak. In [7], letters are determined on the basis of structural features; the features selected for extraction include loops, lines, etc. Post-processing is then carried out by comparing the output with dictionary words to improve accuracy. I