IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 13, NO. 2, APRIL 2009 369
Limitations of Existing Mutation Rate Heuristics
and How a Rank GA Overcomes Them
J. Cervantes and C. R. Stephens
Abstract—Using a set of different search metrics and a set of
model landscapes, we theoretically and empirically study how “optimal”
mutation rates for the simple genetic algorithm (SGA) depend
not only on the fitness landscape, but also on population size
and population state. We discuss the limitations of current mutation
rate heuristics, showing that any fixed mutation rate can be
expected to be suboptimal in terms of balancing exploration and
exploitation. We then develop a mutation rate heuristic that offers
a better balance by assigning different mutation rates to different
subpopulations. When the mutation rate is assigned through
a ranking of the population, according to fitness for example, we
call the resulting algorithm a Rank GA. We show how this Rank
GA overcomes the limitations of other heuristics on a set of model
problems, showing under what circumstances it might be expected
to outperform an SGA with any choice of mutation rate.
Index Terms—Genetic algorithms, optimization methods, search
methods.
I. INTRODUCTION
THERE has always been a strong interest in evolutionary
computation (EC) in what the “optimal” parameter settings
are for a given class of evolutionary algorithms
(EAs), which has given rise to a correspondingly large literature
(see, for example, [1] for a recent overview). Of course, there are
many parameters that potentially affect the performance of an
EA: representation, fitness function, population size, genetic
operators, and their corresponding probabilities, to name just a
few. In this paper, we restrict attention to variation
operators and their associated probabilities for genetic algorithms
(GAs), concentrating especially on the mutation operator
and its associated probabilities.
Manuscript received April 5, 2008; revised April 30, 2008. First published
September 26, 2008; current version published April 01, 2009. This work was
supported by the UNAM macroproyecto “Tecnologias para la Universidad de
la Información y la Computación.” The work of J. Cervantes was supported by
CONACYT through a doctoral fellowship, COMECYT through a thesis grant,
and PCIC-UNAM.
J. Cervantes is with the Instituto de Investigación en Matemáticas Aplicadas
y en Sistemas, UNAM Circuito Exterior s/n Ciudad Universitaria, Mexico City,
México D.F. 04510, and also with the Universidad Autónoma Metropolitana
Artificios s/n Del. Álvaro Obregón, Mexico City D.F. 01120, Mexico (e-mail:
jorgecervanteso@aim.com).
C. R. Stephens is with the Instituto de Ciencias Nucleares, UNAM Circuito
Exterior, Mexico City D.F. 04510, Mexico (e-mail: stephens@nucleares.unam.mx).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TEVC.2008.927707
Of course, even for this restricted class, the literature is large
(see, for example, [2], [3]). Methods for setting the associated
parameters can be divided into two types: tuning and control.
With tuning, the parameter is set before the run, while
with control it is adjusted during a run. Parameter tuning for
mutation in GAs has given rise to several well-known heuristics,
such as the $1/N$ heuristic [4], where $N$ is the string
length, or the error threshold heuristic [5]–[7]. For parameter
control, a taxonomy for classifying such techniques was
introduced in [8], dividing them into three subclasses: deterministic,
adaptive, and self-adaptive. The name of the deterministic class of [8]
is something of a misnomer, given that the parameter change can have a
stochastic component. The subclass is essentially associated with
parameter settings that receive no feedback from the population
dynamics. This type of mutation rate setting has been extensively
studied (see, for example, [9]–[13]), often with a
prescribed dynamic similar to that of a cooling schedule in simulated
annealing. With adaptive schemes, there is feedback from
the search that is used to determine the direction and/or magnitude
of a change in the mutation rate; the work of [14] and
[15] is of particular relevance to our results. Finally, in the
classification of [8] for control parameters, there is the class of
self-adaptive schemes [9], [12], [13], where parameter values are
encoded into the representation and are subject to evolution.
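As a rough illustration of the difference between tuning and deterministic control, the following Python sketch contrasts a rate fixed before the run with a rate that decays over generations in the manner of a cooling schedule. The function names, the exponential form of the decay, and all numerical defaults are assumptions made for illustration only; they are not taken from [8]–[13].

```python
import random


def tuned_rate(generation, rate=0.01):
    """Tuning: the rate is chosen before the run and never changes."""
    return rate


def deterministic_rate(generation, start=0.1, end=0.001, decay=0.95):
    """Deterministic control: the rate follows a prescribed schedule and
    receives no feedback from the population. The exponential decay from
    `start` towards `end` is a hypothetical choice."""
    return end + (start - end) * decay ** generation


def mutate(bits, rate, rng=random):
    """Flip each bit of a binary string independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]


if __name__ == "__main__":
    parent = [0] * 20
    for generation in (0, 10, 50, 200):
        rate = deterministic_rate(generation)
        child = mutate(parent, rate)
        print(generation, round(rate, 4), sum(child))
```

An adaptive scheme would differ only in that the rate function also takes some measurement of the population (for example, recent fitness improvement) as input, while a self-adaptive scheme would store the rate in each individual and mutate it along with the bits.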
With any of the above parameter setting methods, it is
of course important to try to understand under what circumstances
a given method might be expected to work well.
This is the difficult part. Even at the level of the simplest
heuristics, where the mutation rate is tuned, there have been
widely varying recommended “optimal” rates. De Jong [16]
recommended a mutation rate of 0.001, while Grefenstette [2]
recommended 0.01, a factor of ten difference! These heuristics are
completely universal in that the recommended values do not
depend on anything else: neither on the representation, nor on the
fitness function, nor on the population. The next highest degree of
universality comes from the $1/N$ heuristic, which depends only
on the representation via the chromosome length $N$, but not on
the fitness landscape. In this sense, such universal heuristics are
not optimized to a particular problem. Rather, it is hoped that
they give reasonable and robust performance across a wide set
of landscapes.
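To make the comparison between these tuned heuristics concrete, the short Python sketch below computes, for a bitwise mutation rate, the expected number of flipped bits per string and the probability that a string is copied without any mutation. It assumes independent bit flips; the rates 0.001 and 0.01 are the commonly quoted De Jong and Grefenstette values, the third rate is the reciprocal of the string length, and the string length of 100 is an arbitrary choice for illustration.

```python
def expected_flips(rate, length):
    """Expected number of mutated bits when each of `length` bits
    flips independently with probability `rate`."""
    return rate * length


def prob_no_mutation(rate, length):
    """Probability that a string of `length` bits is copied unchanged."""
    return (1.0 - rate) ** length


length = 100  # hypothetical string length
heuristics = {
    "De Jong": 0.001,
    "Grefenstette": 0.01,
    "reciprocal of length": 1.0 / length,
}

if __name__ == "__main__":
    for name, rate in heuristics.items():
        print(name, expected_flips(rate, length),
              round(prob_no_mutation(rate, length), 3))
```

Only the last rule changes with the string length: it keeps the expected number of flips per string at one, whereas a fixed rate produces more flips per string as the strings get longer.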
The error threshold is less universal, being associated with
what is considered to be an optimal balance between exploration
and exploitation. Theoretically, it is determined by
solving for the fixed point of the evolution in the approximation
that back mutations are neglected. The example of a
“needle-in-a-haystack” landscape is treated in Section III-A1.
It can also be characterized phenomenologically. Essentially,
it is a critical mutation rate above which mutation produces
so many errors in fit genotypes that selection is not capable of
maintaining the majority of the population at a fitness peak.
Instead, much of the population spreads out, searching the rest of
the space. This heuristic does depend on the fitness landscape,
as generally the higher the degree of selection, the higher the
corresponding error threshold. A problem with using the error