Ms. Rupali D. Dharmale, Int. Journal of Engineering Research and Applications, www.ijera.com, ISSN: 2248-9622, Vol. 5, Issue 3 (Part -3), March 2015, pp. 84-87

Text Detection and Recognition with Speech Output for Visually Challenged Person: A Review

1 Ms. Rupali D. Dharmale, 2 Dr. P. V. Ingole
1 (Department of Electronics and Telecommunication, G. H. Raisoni College of Engineering and Management, Amravati, Maharashtra, India)
2 (Principal, G. H. Raisoni College of Engineering and Management, Amravati, Maharashtra, India)

ABSTRACT
Reading text from scenes, images and text boards is a demanding task for visually challenged persons. This task has been proposed to be carried out with the help of image processing. Image processing has long contributed to the field of object recognition and is still an emerging area of research. The proposed system reads the text encountered in images and on text boards with the aim of supporting visually challenged persons. Text detection and recognition in natural scenes can give valuable information for many applications. In this work, an approach is attempted to extract and recognize text from scene images and to convert the recognized text into speech. Such a system can be an empowering force in a visually challenged person's life, relieving the frustration of not being able to read whatever they want and thus enhancing their quality of life.

Keywords: Optical character recognition (OCR), Text detection and recognition, Text to speech conversion, Visually challenged person.

I. INTRODUCTION
Every year, the number of visually challenged persons increases due to eye diseases, diabetes, traffic accidents and other causes. Applications that provide support to visually challenged persons have therefore become an important tool.
Recent developments in computer vision, digital cameras, and computers make it possible to assist these persons by developing camera-based products that merge computer vision technology with other existing beneficial products such as optical character recognition (OCR) systems. When a visually challenged person is walking around, it is important to obtain the text information present in the scene or on text boards. Reading is necessary in today’s society. Printed text is everywhere, in the form of reports, receipts, bank documents, restaurant menu cards, classroom handouts, product packages, instructions on medicine bottles, etc. As an important form of communication, text is widely used in our daily life. For example, sign boards, directions, shop names, etc. contain textual and/or symbolic information that a human being perceives to gain knowledge of the environment and perhaps also to help in navigation. The need to read textual and/or symbolic information becomes essential in the case of blind or visually challenged persons. With this in view, the system detects the text on a textual/symbolic board, recognizes the text characters in the captured scene text image, and finally converts the textual and/or symbolic information into speech.

To extract text information from an image, text detection and recognition algorithms are necessary. However, extracting text from a scene image is not an easy task due to two key factors: 1) cluttered backgrounds with noise and non-text outliers, and 2) diverse text patterns such as character types, fonts, and sizes. Text occurs very infrequently in a scene image, and a limited number of text characters are embedded in difficult non-text background outliers [1]. It is also difficult to model the structure of text characters in scene images, because they lack discriminative pixel-level appearance and structure features that would separate them from non-text background outliers.
Further, text consists of different words, where each word may contain different characters in various fonts, styles, and sizes, resulting in large intra-variations of text patterns. To solve these difficult problems, scene text extraction is separated into two processes [2]: text detection and text recognition. Detecting text and classifying characters in scene images is a challenging visual recognition problem when assisting visually challenged people. Text detection localizes the image regions that contain text characters and strings. It aims to remove most non-text background outliers [3]. Text recognition converts pixel-based text into readable code. It aims to distinguish different text characters accurately and to compose text words properly.

RESEARCH ARTICLE OPEN ACCESS
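As a rough illustration of the two-stage split described above, the following Python sketch localizes candidate regions in a tiny binary image using connected components, discards very small components as non-text outliers, and then maps each remaining region to a character by exact template lookup. Everything in it is an assumption made for illustration: the names `detect_text_regions`, `recognize_region`, `read_text` and `GLYPHS`, the `min_pixels` threshold, and the toy image do not come from the reviewed methods. A practical system would replace both stages with a trained text detector and an OCR engine, and would pass the resulting string to a text-to-speech component.

```python
from collections import deque

# Hypothetical glyph templates, stored as tightly cropped rows of "0"/"1".
GLYPHS = {
    ("10", "10", "11"): "L",
    ("1", "1", "1"): "I",
    ("111", "010", "010"): "T",
}

# Toy binary scene: three characters plus one isolated noise pixel (bottom right).
IMAGE = [
    "0000000000000",
    "0100010011100",
    "0100010001000",
    "0110010001000",
    "0000000000001",
]


def detect_text_regions(image, min_pixels=3):
    """Detection stage: return bounding boxes (top, left, bottom, right) of
    4-connected foreground components with at least min_pixels pixels."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != "1" or seen[r][c]:
                continue
            # Breadth-first search collects one connected component.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] == "1" and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Components that are too small are treated as non-text outliers.
            if len(pixels) >= min_pixels:
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                boxes.append((min(ys), min(xs), max(ys), max(xs)))
    # Order regions left to right so characters come out in reading order.
    return sorted(boxes, key=lambda box: box[1])


def recognize_region(image, box):
    """Recognition stage: crop the region and look it up in GLYPHS.
    Unknown shapes are reported as "?"."""
    top, left, bottom, right = box
    crop = tuple(image[y][left:right + 1] for y in range(top, bottom + 1))
    return GLYPHS.get(crop, "?")


def read_text(image):
    """Run detection, then recognition, and join the characters."""
    return "".join(recognize_region(image, box)
                   for box in detect_text_regions(image))


if __name__ == "__main__":
    print(read_text(IMAGE))  # prints LIT
```

The size filter stands in for the outlier removal that detection is meant to perform, and the template lookup stands in for character classification. Exact matching only works on this toy input. Real scene text varies in font, style, and size, which is why the recognition stage needs a far more tolerant classifier.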