IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 39, NO. 2, FEBRUARY 2001 303
Pixel Classification Using Variable String Genetic
Algorithms with Chromosome Differentiation
Sanghamitra Bandyopadhyay, Member, IEEE and Sankar K. Pal, Fellow, IEEE
Abstract—The concept of chromosome differentiation,
commonly witnessed in nature as male and female sexes, is
incorporated in genetic algorithms with variable length strings
for designing a nonparametric classification methodology. Its
significance in partitioning different landcover regions from
satellite images, having complex/overlapping class boundaries, is
demonstrated. The classifier is able to evolve automatically the
appropriate number of hyperplanes efficiently for modeling any
kind of class boundaries optimally. Merits of the system over the
related ones are established through the use of several quantitative
measures.
Index Terms—Genetic algorithms, hyperplane fitting, pattern
recognition, quantitative indices, remote sensing images.
I. INTRODUCTION
CLASSIFICATION of pixels for partitioning different landcover
regions is an important problem in the realm of satellite
imagery. Satellite images usually have a large number of
classes with overlapping and nonlinear class boundaries. Fig. 1
shows, as a typical example, the complexity in the scatter plot of
932 points belonging to seven classes taken from the Systeme
Probatoire d’Observation de la Terre (SPOT) image of a part of
the city of Calcutta. Therefore, for appropriate modeling of such
nonlinear and overlapping class boundaries, the utility of an ef-
ficient search technique is evident. Moreover, it is desirable that
the search technique does not need to assume any particular dis-
tribution of the data set and/or class a priori probabilities.
Genetic algorithms (GAs) [1] are randomized search and op-
timization techniques guided by the principles of evolution and
natural genetics. They are efficient, adaptive and robust search
processes that produce near-optimal solutions and have a large
amount of implicit parallelism. The utility of GAs in solving
problems that are large, multimodal and highly complex has
been demonstrated in several areas [2]. Since satellite images
usually have highly nonlinear and overlapping class boundaries,
application of GAs for searching for the appropriate ones, par-
ticularly under nonparametric conditions (i.e., without assuming
class distributions and a priori probabilities), seems appropriate
and natural.
In the present article, such an attempt is made by demon-
strating the effectiveness of a GA-based classifier, called the
variable string length GA with chromosome differentiation
(VGACD)-classifier, in partitioning different landcover re-
gions. In designing the VGACD-classifier, the concepts of
Manuscript received January 11, 2000; revised May 3, 2000.
The authors are with Machine Intelligence Unit, Indian Statis-
tical Institute, Calcutta, India (e-mail: sanghami@www.isical.ac.in;
sankar@www.isical.ac.in).
Publisher Item Identifier S 0196-2892(01)01172-X.
Fig. 1. Scatter plot for a training set of SPOT image of Calcutta containing
seven classes (1, ..., 7).
variable length strings in GAs (VGAs) [3] and chromosome
differentiation into two classes, male (M) and female (F),
have been integrated for approximating the class boundaries
of a given training data set nonparametrically by an optimum
number of hyperplanes such that the number of misclassified
points is minimized. Unlike the conventional GAs, in VGACD,
the length of a string is not fixed. Moreover, two classes of
chromosomes exist in the population. The crossover, imple-
menting a kind of restricted mating, and mutation operators are
accordingly defined. The fitness function rewards a string with
a smaller number of misclassified samples, as well as a smaller
number of hyperplanes. A comparison, in terms of several
quantitative measures [4] and visual quality of the classified
images, of the VGACD-classifier with VGA-classifier, i.e.,
the one incorporating variable string lengths but without
chromosome differentiation, and those based on the k-NN rule
and the Bayes maximum likelihood (ML) ratio is provided for
a SPOT image of a part of the city of Calcutta.
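As an illustration of a fitness function that rewards both fewer misclassified samples and fewer hyperplanes, the following minimal sketch may help. It is not taken from the text above: the function name `fitness`, its parameters, and the linear penalty with weight `1 / max_hyperplanes` are assumptions chosen for illustration. With that weight the total penalty stays below the value of one correctly classified point, so the hyperplane count only breaks ties between strings with equal misclassification counts.

```python
def fitness(n_points: int, n_misclassified: int,
            n_hyperplanes: int, max_hyperplanes: int) -> float:
    """Hypothetical fitness of one string (higher is better).

    Assumed form: number of correctly classified training points
    minus a small penalty proportional to the number of hyperplanes.
    """
    if not 0 < n_hyperplanes <= max_hyperplanes:
        raise ValueError("n_hyperplanes must lie in 1..max_hyperplanes")
    if not 0 <= n_misclassified <= n_points:
        raise ValueError("n_misclassified must lie in 0..n_points")

    # Assumed penalty weight: the total penalty never exceeds 1, so
    # accuracy always dominates and string length only breaks ties.
    penalty_weight = 1.0 / max_hyperplanes
    return (n_points - n_misclassified) - penalty_weight * n_hyperplanes
```

Under these assumptions, a string with 3 hyperplanes scores higher than one with 5 hyperplanes at the same misclassification count, while a string with one fewer misclassified point scores higher regardless of its length.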
II. DESCRIPTION OF THE VGACD-CLASSIFIER
A. Principle of Hyperplane Fitting
As mentioned, the classifier attempts to place a number
of hyperplanes in the feature space appropriately such that
the number of misclassified training points is minimized.
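The objective can be made concrete with a small sketch that counts misclassified training points for a given set of hyperplanes. Two assumptions are made here that the text does not state: each hyperplane is written in the generic normal-vector form w · x = d rather than an angle-based parameterization, and every region carved out by the hyperplanes is labeled with the majority class of the training points inside it. The name `count_misclassified` is hypothetical.

```python
from collections import Counter, defaultdict
from typing import Hashable, Sequence, Tuple

Hyperplane = Tuple[Sequence[float], float]  # (normal vector w, offset d)


def count_misclassified(points: Sequence[Sequence[float]],
                        labels: Sequence[Hashable],
                        hyperplanes: Sequence[Hyperplane]) -> int:
    """Count training points outside the majority class of their region."""
    regions = defaultdict(Counter)
    for x, y in zip(points, labels):
        # A region is identified by which side of each hyperplane x lies on.
        signature = tuple(
            sum(wi * xi for wi, xi in zip(w, x)) >= d
            for w, d in hyperplanes
        )
        regions[signature][y] += 1

    # Assumed labeling rule: each region takes its majority class, so
    # every point of any other class in that region is misclassified.
    return sum(sum(c.values()) - max(c.values()) for c in regions.values())
```

For example, four one-sided points `[(0, 0), (1, 0), (3, 0), (4, 0)]` with labels `[0, 0, 1, 1]` are separated without error by the single hyperplane `((1, 0), 2)`, whereas the hyperplane `((1, 0), 0.5)` leaves one point misclassified.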
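From elementary geometry, the equation of a hyperplane in
$N$-dimensional space ($X_1 - X_2 - \cdots - X_N$) is given by
$$x_N \cos\alpha_{N-1} + \beta_{N-1} \sin\alpha_{N-1} = d \qquad (1)$$
where $\beta_{N-1} = x_{N-1} \cos\alpha_{N-2} + \beta_{N-2} \sin\alpha_{N-2}$. Here, $\alpha_i$ is
the angle that the projection of the unit normal in the
($X_1 - X_2 - \cdots - X_{i+1}$) space makes with the $X_{i+1}$ axis. Since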
0196–2892/01$10.00 © 2001 IEEE