Color K-means, Gaussian Filter and Aperture Concept for Text Localization in Images

K. J. Dayananda 1 and D. Puttegowda 2
1-2 Department of Computer Science and Engineering, ATME College of Engineering, Mysuru, India
Email: dayananda.kem@gmail.com, puttegowda.77@gmail.com

Abstract— Text extraction from images plays an essential role in machine learning and computer vision. The text localization process determines the presence and location of text in the given input. The challenges that arise during text localization include differing orientations, low resolution, complex backgrounds, and illumination changes, along with variation in font size and color. In this paper, color k-means is applied to separate the colors present in the text image into groups. Histogram equalization is applied to sharpen the contrast between text pixels and background pixels. A Gaussian filter is applied to merge neighboring text pixels together. The aperture concept is implemented to locate the actual text regions. The standard Hua's and NUS datasets were used to evaluate the presented model, with precision, recall and f-measure as the performance metrics.

Index Terms— Color K-means, Histogram Equalization, Gaussian filter, Aperture, Text localization.

I. INTRODUCTION

Text conveys meaning that a person can read and understand easily, and text in videos carries a rich set of information. Text extraction from images is a challenging research field that attracts a large number of scholars. Although various optical character recognition systems have been developed, the problem of text localization is still not thoroughly solved. Text localization determines whether text exists in an image and identifies the region of text pixels.
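As a rough illustration of the color k-means step named in the abstract, the following minimal sketch groups RGB pixels into k color clusters. It is not the authors' implementation: the function names `sq_dist` and `color_kmeans`, the deterministic initialization from the first k distinct colors, and the iteration cap are all assumptions made for this example, and the code uses plain Python lists rather than an image library.

```python
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two RGB colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def color_kmeans(
    pixels: List[Sequence[float]], k: int, iters: int = 20
) -> Tuple[List[int], List[Color]]:
    """Cluster RGB pixels into k color groups (hypothetical helper).

    Returns one cluster label per pixel and the k cluster centers.
    """
    # Assumed initialization: the first k distinct colors in scan order.
    centers: List[Color] = []
    for p in pixels:
        c = tuple(float(v) for v in p)
        if c not in centers:
            centers.append(c)
        if len(centers) == k:
            break
    if len(centers) < k:
        raise ValueError("fewer than k distinct colors in the input")

    labels = [0] * len(pixels)
    for _ in range(iters):
        # Assignment step: attach each pixel to its nearest center.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in pixels
        ]
        # Update step: move each center to the mean color of its members.
        new_centers: List[Color] = []
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(ch) / len(members) for ch in zip(*members))
                )
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(centers[j])
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return labels, centers
```

Calling `color_kmeans(pixels, k=2)` on a flattened list of `(R, G, B)` values returns a label per pixel, which can be reshaped to the image size to obtain one binary mask per color group. The choice of k is left to the caller, as in the text.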
The main challenges in text localization are complex backgrounds, low resolution, character alignment, lighting, and varied font shapes, sizes and colors. The motivation of this work is to move toward arbitrarily oriented, multilingual text localization. Previous models are able to locate text on clear backgrounds at low resolution, but most of them fail to locate text under varying illumination. This work is able to locate text under varying illumination while taking the challenges listed above into account. We observed that text regions have distinctive features compared with non-text regions and that their connectivity differs from that of the background. In the first step, a color k-means algorithm divides the input image into a set of color groups, with the number of groups given by the parameter k. Histogram equalization and a Gaussian filter are then used to perceive the textured regions of the image with the help of the texture filter function. The Gaussian filter makes the edges of the image perceptible. The aperture concept is applied to the filtered image to extract the true text content. False positives are then eliminated by studying the bounding-box regions of text and non-text components. Using k-means, the Gaussian filter, and the aperture feature, we present a text identification model for images and videos. Experimental results show that the presented model efficiently locates the region

Grenze ID: 01.GIJET.9.1.614 © Grenze Scientific Society, 2023 Grenze International Journal of Engineering and Technology, Jan Issue