Image Segmentation By Learning Approach

Horacio Andrés Legal-Ayala, Jacques Facon
Pontifical Catholic University of Parana (PUCPR)
Postgraduate Program in Applied Informatics (PPGIA)
Rua Imaculada Conceição, 1115 – Prado Velho, Curitiba, PR, Brazil
E-mail: {horacio, facon}@ppgia.pucpr.br

Abstract

This article describes a new approach to segmentation by thresholding based on learning. The method learns to threshold correctly from a submitted image and its ideal thresholded version. This stage generates a decision matrix for each pixel and each gray level, which is reused when new images are segmented. A new image is thresholded by means of a new strategy based on nearest neighbors, which seeks, for each pixel of the new image, the best solution in the decision matrix. Tests performed on handwritten documents showed promising results. In terms of the quality of the results, the developed technique is equal or superior to traditional segmentation-by-thresholding techniques, with the advantage that it does not require the use of heuristic parameters.

1. Introduction

Image segmentation is an indispensable task in image processing. It is usually necessary to extract the semantically relevant connected components from an image and to represent them efficiently. A wide variety of segmentation techniques can be found in the literature. Because of its very simple principle, segmentation by thresholding is probably one of the most traditional and popular region-based segmentation techniques. There are basically two families of thresholding approaches: global thresholding, where a single threshold value is sought for the entire image, and adaptive or local thresholding, where a threshold value is sought for each pixel or group of pixels.
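The difference between the two families can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the toy image, the local-mean rule and the `offset` parameter are all assumptions chosen for illustration, and dark pixels are assumed to be the foreground (ink).

```python
def global_threshold(image, t):
    """Binarize with a single threshold t for the whole image (1 = dark foreground)."""
    return [[1 if px < t else 0 for px in row] for row in image]


def local_mean_threshold(image, radius=1, offset=0):
    """Binarize each pixel against the mean of its own window, minus an offset.

    Assumption: the local threshold is the window mean minus `offset`;
    this is one common local rule, not the one proposed in the paper.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the window around (y, x), clipped at the image border.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            local_t = sum(window) / len(window) - offset
            out[y][x] = 1 if image[y][x] < local_t else 0
    return out


# Toy image (hypothetical): bright background on the left, darker background
# on the right, with one ink pixel in each half.
image = [
    [200, 200, 200, 100, 100, 100],
    [200, 120, 200, 100, 20, 100],
    [200, 200, 200, 100, 100, 100],
]
ideal = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

local_result = local_mean_threshold(image, radius=1, offset=30)
global_matches = [t for t in range(256) if global_threshold(image, t) == ideal]
```

On this toy image no single global threshold reproduces `ideal`, because the ink on the bright side (120) is lighter than the background on the dark side (100), while the local rule recovers both ink pixels. Note that `radius` and `offset` are exactly the kind of heuristic parameters that the learning approach aims to avoid.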
However, in many applications where images suffer from illumination problems, poor gray-level separation between the relevant foreground and the background, or low quality, segmentation by thresholding often yields worse results than expected. It can be said, without too much doubt, that image segmentation is a very frustrating task in the field of image processing. The user knows visually what to extract from the image, but the computer usually cannot reproduce this objective automatically.

Studies of computational learning are not really new. An informal definition in [1] states: “It is said that a computer program, in order to execute a task, was acquired by learning if it was acquired by any means except by explicit programming”. Recently, Barrera et al. [2] and Hirata [3] used this model to formalize binary morphological operators by learning, thus limiting segmentation to the binary case. Kim [4], working instead with gray-scale images, designed digital filters by learning from sample images. In this approach, the window size must be chosen manually, and the nearest-neighbor technique selects the pattern whose mean pixel-by-pixel difference from the observed pattern is lowest. Genetic algorithms [5] and neural networks [6] are also widely used approaches that attempt to learn in order to segment. Neural networks are trained on a large set of image pairs, each consisting of a sample input and its ideal output. Techniques aided by genetic algorithms try to obtain the best set of features during this training.

This paper presents a new approach to segmentation by thresholding based on learning. The main idea is to take advantage of the user’s knowledge and of his aim in transforming a gray-level image into a binary image. If the user knows what to extract from an image, why not share this knowledge with the computer?
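The idea of learning from one sample image and its ideal binary version, then thresholding new images by nearest-neighbor search, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' decision matrix: the feature vector (the raw gray levels of a 3x3 window), the dictionary used as a lookup table, the squared Euclidean distance, and all function names are hypothetical choices.

```python
def window_features(image, y, x, radius=1):
    """Gray levels of the window centered on (y, x); border pixels are replicated."""
    h, w = len(image), len(image[0])
    return tuple(
        image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
    )


def learn(sample, ideal, radius=1):
    """Learning stage: map each observed neighborhood to its ideal binary label."""
    table = {}
    for y in range(len(sample)):
        for x in range(len(sample[0])):
            # Keep the first label seen for a given neighborhood (assumption).
            table.setdefault(window_features(sample, y, x, radius), ideal[y][x])
    return table


def segment(image, table, radius=1):
    """Segmentation stage: label each pixel like its nearest learned neighborhood."""
    entries = list(table.items())
    out = []
    for y in range(len(image)):
        row = []
        for x in range(len(image[0])):
            f = window_features(image, y, x, radius)
            if f in table:
                row.append(table[f])  # exact match, no search needed
            else:
                # Nearest neighbor by squared Euclidean distance (assumption).
                _, label = min(
                    entries,
                    key=lambda e: sum((a - b) ** 2 for a, b in zip(e[0], f)),
                )
                row.append(label)
        out.append(row)
    return out


# Hypothetical training pair: one gray-level sample and its ideal thresholding.
sample = [
    [200, 200, 200, 100, 100, 100],
    [200, 120, 200, 100, 20, 100],
    [200, 200, 200, 100, 100, 100],
]
ideal = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

table = learn(sample, ideal)
# A "new" image: the same scene, uniformly 5 gray levels brighter.
new_image = [[min(255, px + 5) for px in row] for row in sample]
result = segment(new_image, table)
```

In this sketch no threshold value is ever chosen by hand: the labels come entirely from the training pair. The brute-force search over all learned neighborhoods is kept for clarity and would be too slow for full-size document images.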
The present approach consists in teaching the computer how to threshold, using images that have known solutions. Unlike neural networks or genetic training algorithms, this approach learns from a single pair of sample/ideal images. The process consists of two stages, learning and segmentation. In the first stage, an image is submitted along with its ideal segmented image in order to extract the relevant features from each pixel’s neighborhood. The second stage consists in segmenting new images using the decision matrix that was built.

Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR 2003) 0-7695-1960-1/03 $17.00 © 2003 IEEE

Section 2 explains the learning methodology, including the selection of the features and the appropriate window size. Section 3 presents the segmentation technique for