How Learning Can Affect the Course of Evolution in Dynamic Environments

Reiji SUZUKI, Takaya ARITA
Graduate School of Human Informatics, Nagoya University
Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
E-mail: {reiji, ari}@info.human.nagoya-u.ac.jp

Abstract

The Baldwin effect concerns the interaction between learning and evolution; it suggests that individual lifetime learning can influence the course of evolution without the Lamarckian mechanism. We consider the Baldwin effect in dynamic environments, especially those in which there is no explicit optimal solution through generations and fitness depends only on interactions among agents. We adopt the iterated Prisoner’s Dilemma as a dynamic environment and introduce phenotypic plasticity to strategies by using a meta-learning rule termed “Meta-Pavlov”. In this simulation, the Baldwin effect was observed as follows: First, strategies with enough plasticity spread, which caused a shift from a defective population to a cooperative one. Second, these strategies were replaced by the strategy [x00x], which has a modest amount of plasticity.

Keywords: Baldwin effect, learning and evolution, iterated Prisoner’s Dilemma, artificial life.

1 Introduction

The interactions between learning and evolution have been discussed extensively. The Baldwin effect [1] is one of them; it suggests that individual lifetime learning can influence the course of evolution without the Lamarckian mechanism. This effect has recently attracted the attention not only of biologists but also of computer scientists, following the evolutionary simulation of Hinton and Nowlan [2]. Since Hinton and Nowlan, many studies have been conducted, most of which have discussed the effect on the assumption that the environment is static and the optimal solution is fixed.
However, as we see in the real world, learning may be more effective and more widely used in dynamic environments, because the flexibility that plasticity provides is itself advantageous for adapting to a changing world. Therefore, it is important to examine how learning can affect the course of evolution in dynamic environments.

Our objective is to clarify the function and the mechanism of the Baldwin effect in dynamic environments, focusing on the balance between the benefit and the cost of learning, whereas most studies of the Baldwin effect have addressed static environments. In general, dynamic environments can be divided into two types: environments in which the optimal solution changes as the environment changes, and environments in which each agent’s fitness is determined by interactions with other agents.

For the former type of environment, Anderson [3] quantitatively analyzed how learning affects the evolutionary process in a dynamic environment whose optimal solution changes through generations. Sasaki and Tokoro [4] studied the relationship between learning and evolution using a simple model in which individuals learn to distinguish poison from food by modifying the connection weights of a neural network. These studies emphasized the importance of learning in dynamic environments.

We adopted the iterated Prisoner’s Dilemma (IPD) as an environment of the latter type, in which there is no explicit optimal solution through generations and the fitness of agents depends only on interactions among them. This paper briefly describes the Baldwin effect, explains our evolutionary model, and discusses how this effect was observed in the evolutionary experiments.

2 Background

The Baldwin effect explains interactions between learning and evolution by paying attention to the balance between the benefit and the cost of learning.
The Baldwin effect consists of the following two steps (Turney, Whitley and Anderson [5]). In the first step, lifetime learning (phenotypic plasticity) gives individual agents chances to change their phenotypes. If the learned traits are useful to the agents and increase their fitness, they will spread in the next population. In the second step, if the environment is sufficiently stable, the evolutionary path finds innate traits that can replace the learned traits, because of the cost of learning.
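
The balance between benefit and cost in these two steps can be illustrated with a minimal sketch in the style of the static-environment simulation of Hinton and Nowlan [2]. This is not the model used in this paper, which is based on the IPD and Meta-Pavlov. The genome encoding ('1' for an innately correct allele, '0' for an innately wrong one, '?' for a plastic one), the function names, and the constants (1000 learning trials, a maximum bonus of 19) are assumptions chosen for illustration only.

```python
import random

MAX_TRIALS = 1000   # learning trials per lifetime (illustrative assumption)
BONUS = 19.0        # maximum fitness bonus for the correct phenotype (assumption)


def lifetime_fitness(genome, rng, max_trials=MAX_TRIALS):
    """Fitness of one agent whose genome is a string over '1', '0' and '?'.

    '1' is innately correct, '0' is innately wrong, '?' is plastic and is
    re-guessed at random on every learning trial.
    """
    if "0" in genome:
        # A wrong innate allele cannot be repaired by learning.
        return 1.0
    plastic = genome.count("?")
    if plastic == 0:
        # Fully innate solution: nothing to learn, so no learning cost.
        return 1.0 + BONUS
    for used in range(1, max_trials + 1):
        # One trial: every plastic locus is guessed independently.
        if all(rng.random() < 0.5 for _ in range(plastic)):
            # Benefit of learning, reduced by the trials already spent (cost).
            return 1.0 + BONUS * (max_trials - used) / max_trials
    return 1.0  # the correct phenotype was never found


def mean_fitness(genome, runs=200, seed=0):
    """Average lifetime fitness over several independent lifetimes."""
    rng = random.Random(seed)
    return sum(lifetime_fitness(genome, rng) for _ in range(runs)) / runs


if __name__ == "__main__":
    for genome in ("1" * 10 + "?" * 10, "1" * 18 + "?" * 2, "1" * 20):
        print(genome, round(mean_fitness(genome), 2))
```

Under these assumptions, a genome with plastic loci can reach the useful phenotype that a wrong innate allele would rule out (the first step), while a genome with fewer plastic loci spends fewer trials on learning and therefore tends to obtain a higher fitness (the second step).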