Image Segmentation By Learning Approach
Horacio Andrés Legal-Ayala, Jacques Facon
Pontifical Catholic University of Parana (PUCPR)
Postgraduate Program in Applied Informatics (PPGIA)
Rua Imaculada Conceição, 1115 – Prado Velho, Curitiba, PR, Brazil
E-mail: {horacio, facon}@ppgia.pucpr.br
Abstract
This article describes a new segmentation by
thresholding approach based on learning. The method
learns to threshold correctly from an image submitted
together with its ideal thresholded version. This stage
generates a decision matrix for each pixel and each gray
level, which is reused when new images are
segmented. The new image is thresholded
by means of a new strategy based on the nearest
neighbors, that seeks, for each pixel of this new image,
the best solution in the decision matrix. Tests performed
on handwritten documents showed promising results. In
terms of quality of the results, the developed technique is
equal or superior to traditional thresholding
techniques, with the advantage that it
does not require the use of heuristic
parameters.
1. Introduction
Image segmentation is an indispensable task in the
area of image processing. It is usually necessary to extract
the semantically relevant connected components from an
image and to represent them efficiently. It is possible to
find in the literature a wide variety of segmentation
techniques. Because of its very simple principle,
segmentation by thresholding is probably one of the most
traditional and popular region-based segmentation
techniques. There are basically two families of
thresholding approaches: global thresholding,
where a single threshold value is sought for the entire
image, and adaptive or local thresholding, where a
threshold value is sought for each pixel or group of pixels.
However, in many applications where images suffer from
illumination problems, poor gray-level separation
between the relevant foreground and the background, and
quality defects, segmentation by thresholding often
yields worse results than expected.
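
The difference between the two families, and the illumination problem mentioned above, can be illustrated with a minimal sketch. It is not taken from this paper: the function names, the local-mean rule, the `radius` and `offset` parameters and the toy gray levels are all assumptions chosen for illustration. The toy image has a background that brightens from left to right, with one dark ink pixel in each half.

```python
def global_threshold(img, t):
    """Binarize with a single threshold t for the whole image (1 = ink)."""
    return [[1 if v < t else 0 for v in row] for row in img]


def local_mean_threshold(img, radius=1, offset=0):
    """Binarize each pixel against the mean of its own window.

    The window is (2*radius+1) x (2*radius+1), clipped at the borders.
    Both parameters are hypothetical heuristics, not values from the paper.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            win = [img[j][i]
                   for j in range(max(0, y - radius), min(h, y + radius + 1))
                   for i in range(max(0, x - radius), min(w, x + radius + 1))]
            mean = sum(win) / len(win)
            out[y][x] = 1 if img[y][x] < mean - offset else 0
    return out


# Toy image (assumed values): background rises from 60 to 200,
# ink pixels at columns 1 and 6 are darker than their surroundings.
row = [60, 30, 100, 120, 140, 160, 130, 200]
img = [row[:] for _ in range(3)]
ideal = [[0, 1, 0, 0, 0, 0, 1, 0] for _ in range(3)]

local = local_mean_threshold(img, radius=1, offset=5)
# No single global value separates ink from background here:
# catching the ink at 130 also catches the background at 60.
global_ok = any(global_threshold(img, t) == ideal for t in range(256))
```

In this toy case the local rule recovers both ink pixels while every global threshold fails, but only because `radius` and `offset` were picked by hand, which is exactly the kind of heuristic parameter the approach described in this paper aims to avoid.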
It can be said, without too much doubt, that image
segmentation is a very frustrating task in the field of
image processing. The user knows visually what to
extract from the image, but the computer usually cannot
carry out or reproduce this objective automatically.
Studies on computational learning are not
new. An informal definition in [1] states: “It is said that
a computer program, in order to execute a task, was
acquired by learning if it was acquired by any means
except by explicit programming”. Recently, Barrera et al.
[2] and Hirata [3] used this model to formalize binary
morphological operators by learning, which limits the
segmentation to binary cases. Kim [4], although working
with gray-scale images, designed digital filters by
learning from sample images. In this approach, the
window size must be chosen manually, and the
nearest neighbor technique selects the pattern whose
mean pixel-by-pixel difference is lowest.
Genetic algorithms [5] and neural networks [6] are also
widely used approaches that attempt to learn in order to
segment. Neural networks are trained on a large set of
sample-input/ideal-output image pairs. Techniques aided
by genetic algorithms try to obtain the best set of
features during this training.
This paper presents a new segmentation by
thresholding approach based on learning. The main idea
is to take advantage of the user’s knowledge and
intent when transforming a gray-level image into a binary
image. If the user knows what to extract from an image,
why not share this knowledge with the computer?
The present approach consists in teaching the
computer to threshold, using for this
purpose images with known solutions. Unlike
neural network or genetic algorithm training,
this approach learns from a single pair of sample/ideal
images.
The process consists of two stages, learning and
segmentation. In the first stage, an image is submitted
along with its ideal segmented image in order to extract
the relevant features from each pixel’s neighborhood. The
second stage consists in segmenting new images
employing the decision matrix built in the first stage.
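
The two stages can be sketched as follows. This is a rough illustration under stated assumptions, not the method detailed in the following sections: the raw gray levels of a 3x3 window stand in for the relevant features, a plain dictionary stands in for the decision matrix, and the squared Euclidean distance is used for the nearest-neighbor search. The names `features`, `learn` and `segment` are hypothetical.

```python
def features(img, y, x, radius=1):
    """Assumed features: gray levels of the window around (y, x).

    Coordinates outside the image are clamped to the nearest border pixel.
    """
    h, w = len(img), len(img[0])
    return tuple(img[min(max(j, 0), h - 1)][min(max(i, 0), w - 1)]
                 for j in range(y - radius, y + radius + 1)
                 for i in range(x - radius, x + radius + 1))


def learn(sample, ideal, radius=1):
    """Learning stage: map each pixel's features to its ideal binary label.

    If two pixels share the same features, the later one overwrites the
    earlier one (a simplification made for this sketch).
    """
    table = {}
    for y in range(len(sample)):
        for x in range(len(sample[0])):
            table[features(sample, y, x, radius)] = ideal[y][x]
    return table


def segment(img, table, radius=1):
    """Segmentation stage: give each pixel the label of its nearest entry."""
    entries = list(table.items())
    out = []
    for y in range(len(img)):
        out_row = []
        for x in range(len(img[0])):
            f = features(img, y, x, radius)
            _, label = min(
                entries,
                key=lambda e: sum((a - b) ** 2 for a, b in zip(e[0], f)))
            out_row.append(label)
        out.append(out_row)
    return out


# A single sample/ideal pair (toy values): one ink pixel on a bright page.
sample = [[200, 200, 200, 200, 200],
          [200, 200, 40, 200, 200],
          [200, 200, 200, 200, 200]]
ideal = [[0, 0, 0, 0, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 0]]
table = learn(sample, ideal)

# A new, darker image of a different size, segmented with the learned table.
new = [[180, 180, 180, 180],
       [180, 60, 180, 180],
       [180, 180, 180, 180]]
result = segment(new, table)
```

Here `result` marks only the pixel at row 1, column 1 as ink, even though the gray levels of the new image differ from those of the sample. The exhaustive search over all stored entries is quadratic in the image size and is kept only for readability.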
Section 2 explains the learning methodology including
the selection of the features and the appropriate window
size. Section 3 presents the segmentation technique for
Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR 2003)
0-7695-1960-1/03 $17.00 © 2003 IEEE