International Research Journal on Advanced Science Hub (IRJASH), Volume 02, Issue 08, August 2020
www.rspsciencehub.com

Comparative Analysis of Segmentation and Recognition Techniques for Offline Handwritten Words

Monika Kohli¹, Satish Kumar²
¹ Research Scholar, Department of Computer Science and Applications, Panjab University, Chandigarh.
² Associate Professor, Department of Computer Applications, Panjab University, SSG Regional Centre Hoshiarpur, Punjab.
¹ monikakrajotia@gmail.com

Abstract
Pre-processing is the initial and vital phase in optical character recognition. Segmentation deals with the extraction of individual components from a document image. A number of techniques, such as projection profiles, connected components, and gaps between characters/components, are reported in the literature for component extraction, followed by feature extraction and recognition of the individual components. These techniques give good results if the components are isolated but fail if the components are touching, shadowed, or skewed. A novel technique is required to address such issues and enhance the recognition rate. The problem of segmentation for cursive handwriting in Roman script has been addressed by various authors but has not been adequately addressed for Indian scripts, especially Devanagari. This paper is a review confined to the offline handwritten script domain. It attempts to review various techniques for character segmentation, with attention to touching characters, for offline handwritten words in Devanagari script and scripts sharing similar characteristics (such as Bangla and Gurumukhi), along with the databases used and the accuracies reported in the literature.

Keywords: Devanagari script, OCR (Optical Character Recognition), Segmentation, Touching characters

1. Introduction
OCR (Optical Character Recognition) is a process that converts printed or handwritten data in the form of an image, acquired online or offline, into machine-encoded form.
The purpose of converting data images into digital format is to edit and search the data electronically and to store the digitized data compactly. ICR (Intelligent Character Recognition) is more precise than OCR, as the computer system is trained to learn different styles and fonts; its major application is automated form processing. It has major advantages in terms of speed, accuracy, and cost. It reduces errors, since manual data entry is prone to typographical mistakes. Devanagari script is widely used in the northern and western parts of India. The script has more than 300 million users and various applications. Segmentation-based or holistic approaches are used in the literature for the recognition of Devanagari script. A number of languages have been recognized using these approaches. Both approaches have shortcomings. However, according to the literature survey, the holistic approach does not give good results (Shaw, Parui, and Shridhar 2008). The segmentation approach gives better results, but segmentation of Devanagari script is difficult because of the presence of a large character set which