IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 13, NO. 2, APRIL 2009

Limitations of Existing Mutation Rate Heuristics and How a Rank GA Overcomes Them

J. Cervantes and C. R. Stephens

Abstract—Using a set of different search metrics and a set of model landscapes, we theoretically and empirically study how “optimal” mutation rates for the simple genetic algorithm (SGA) depend not only on the fitness landscape, but also on population size and population state. We discuss the limitations of current mutation rate heuristics, showing that any fixed mutation rate can be expected to be suboptimal in terms of balancing exploration and exploitation. We then develop a mutation rate heuristic that offers a better balance by assigning different mutation rates to different subpopulations. When the mutation rate is assigned through a ranking of the population, according to fitness for example, we call the resulting algorithm a Rank GA. We show how this Rank GA overcomes the limitations of other heuristics on a set of model problems, showing under what circumstances it might be expected to outperform an SGA with any choice of mutation rate.

Index Terms—Genetic algorithms, optimization methods, search methods.

I. INTRODUCTION

There has always been a strong interest in evolutionary computation (EC) with respect to what are “optimal” parameter settings for a given class of evolutionary algorithms (EAs), which has given rise to a correspondingly large literature (see, for example, [1] for a recent overview). Of course, there are many parameters that potentially affect the performance of an EA: representation, fitness function, population size, genetic operators, and their corresponding probabilities, to name just a few. In this paper, we will restrict attention to the area of variation operators and their associated probabilities for genetic algorithms (GAs), concentrating especially on the mutation operator and its associated probabilities.
Manuscript received April 5, 2008; revised April 30, 2008. First published September 26, 2008; current version published April 01, 2009. This work was supported by the UNAM macroproyecto “Tecnologias para la Universidad de la Información y la Computación.” The work of J. Cervantes was supported by CONACYT through a doctoral fellowship, COMECYT through a thesis grant, and PCIC-UNAM.
J. Cervantes is with the Instituto de Investigación en Matemáticas Aplicadas y en Sistemas, UNAM Circuito Exterior s/n Ciudad Universitaria, Mexico City, México D.F. 04510, and also with the Universidad Autónoma Metropolitana Artificios s/n Del. Álvaro Obregón, Mexico City D.F. 01120, Mexico (e-mail: jorgecervanteso@aim.com).
C. R. Stephens is with the Instituto de Ciencias Nucleares, UNAM Circuito Exterior, Mexico City D.F. 04510, Mexico (e-mail: stephens@nucleares.unam.mx).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TEVC.2008.927707

Of course, even for this restricted class, the literature is large (see, for example, [2], [3]). Methods for setting the associated parameters can be divided into two types: tuning and control. With tuning, the parameter is set before the run, while with control it is adjusted during a run. Parameter tuning for mutation in GAs has given rise to several well-known heuristics, such as the 1/N heuristic [4], where N is the string length, or the error threshold heuristic [5]–[7]. For parameter control, a taxonomy for classifying such techniques was introduced in [8], dividing them into three subclasses: deterministic, adaptive, and self-adaptive. The deterministic class of [8] is a bit of a misnomer, given that the parameter change can have a stochastic component. The subclass is essentially associated with parameter settings that receive no feedback from the population dynamics.
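The distinction between a tuned rate and a deterministically controlled one can be illustrated with a minimal sketch. It assumes a bit-string representation and per-bit flip mutation; the function names, the geometric decay form, and the default values `start=0.1` and `decay=0.95` are hypothetical choices made for illustration and are not taken from the paper or from the cited references.

```python
import random


def one_over_n_rate(n_bits: int) -> float:
    """Tuned rate: fixed before the run, depends only on the string length."""
    return 1.0 / n_bits


def scheduled_rate(generation: int, n_bits: int,
                   start: float = 0.1, decay: float = 0.95) -> float:
    """Deterministic control (hypothetical schedule): geometric decay from
    `start`, floored at 1/N. The rate depends only on the generation
    counter and receives no feedback from the population."""
    return max(1.0 / n_bits, start * decay ** generation)


def mutate(bits: list[int], rate: float, rng: random.Random) -> list[int]:
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]


if __name__ == "__main__":
    rng = random.Random(0)
    parent = [0] * 20
    for gen in (0, 10, 100):
        rate = scheduled_rate(gen, len(parent))
        print(gen, round(rate, 4), mutate(parent, rate, rng))
```

In this sketch the schedule is a pure function of the generation counter, which is what places it in the class that receives no feedback from the population dynamics; an adaptive variant would instead take some statistic of the current population as an argument.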
This type of mutation rate setting has been extensively studied (see, for example, [9]–[13]), often with a prescribed dynamic similar to that of a cooling schedule in simulated annealing. With adaptive schemes, there is feedback from the search that is used to determine the direction and/or magnitude of a change in the mutation rate; the work of [14] and [15] is of particular relevance to our results. Finally, in the classification of [8] for control parameters, there is the class of self-adaptive schemes [9], [12], [13], where parameter values are encoded into the representation and are subject to evolution.

With any of the above parameter setting methods, it is of course important to try to understand under what circumstances a given method might be expected to work well. This is the difficult part. Even at the level of the simplest heuristics, where the mutation rate is tuned, there have been widely varying recommended “optimal” rates. De Jong [16] recommended a mutation rate of 0.001, while Grefenstette [2] recommended 0.01, a factor of ten difference! These heuristics are completely universal in that the recommended values do not depend on anything else: not the representation, the fitness function, or the population. The next highest degree of universality comes from the 1/N heuristic, which depends only on the representation via the chromosome length, but not on the fitness landscape. In this sense, such universal heuristics are not optimized to a particular problem. Rather, it is hoped that they give reasonable and robust performance across a wide set of landscapes.

The error threshold is less universal, being associated with what is considered to be an optimal balance between exploration and exploitation. Theoretically, it is determined by solving for the fixed point of the evolution in the approximation that back mutations are neglected. The example of a “needle-in-a-haystack” landscape is treated in Section III-A1.
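As a hedged illustration of what such a fixed-point calculation looks like, the following sketch uses the standard textbook argument for a needle-in-a-haystack landscape. It assumes an infinite population, fitness-proportional selection followed by per-bit mutation, a single needle of fitness $f_1$, and fitness $f_0 < f_1$ for every other string. The symbols $x_t$, $f_0$, $f_1$, $p$, and $N$ are chosen here for illustration and need not match the notation of Section III-A1.

```latex
% x_t : fraction of the population at the needle in generation t
% p   : per-bit mutation rate,  N : string length
% Back mutations neglected: a needle arises only by error-free copying
% of a needle, which happens with probability (1-p)^N.
\begin{align}
  x_{t+1} &= \frac{(1-p)^N f_1\, x_t}{f_0 + (f_1 - f_0)\, x_t}, \\
  % nonzero fixed point x_{t+1} = x_t = x^*:
  x^* &= \frac{(1-p)^N f_1 - f_0}{f_1 - f_0}, \\
  % x^* > 0 requires (1-p)^N > f_0/f_1, so the critical rate is
  p_c &= 1 - \left(\frac{f_0}{f_1}\right)^{1/N}
       \approx \frac{\ln (f_1/f_0)}{N} \quad (N \gg 1).
\end{align}
```

Under these assumptions, the needle is retained by a finite fraction of the population only for $p < p_c$, and $p_c$ grows with the selective advantage $f_1/f_0$ while shrinking roughly as $1/N$ with string length.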
It can also be characterized phenomenologically. Essentially, it is a critical mutation rate above which mutation produces so many errors in fit genotypes that selection is not capable of maintaining the majority of the population at a fitness peak. Instead, much of the population spreads out, searching the rest of the space. This heuristic does depend on the fitness landscape since, generally, the higher the degree of selection, the higher the corresponding error threshold. A problem with using the error
1089-778X/$25.00 © 2008 IEEE