Abstract—Human beings learn to do a task and then go on to learn other tasks. However, they do not forget their previous learning. If the need arises, they can call upon that learning and do not have to relearn from scratch. In this paper, we build upon our earlier work, in which we presented a mechanism for learning multiple tasks in a dynamic environment where the tasks can change arbitrarily without any warning to the learning agents. The main feature of the mechanism is that a percentage of the learning agents is periodically made to reset their previous learning and restart learning afresh. Thus, there is always a sub-population which can learn the new task, whenever a task change occurs, without being hampered by previous learning. The learning then spreads to the other members of the population as well. In our current work we experiment with the incorporation of an archive for preserving those strategies which have performed well. The strategies in the archive are tested from time to time in the current environment. If the current task is the same as the task for which a strategy was first discovered, then that strategy rapidly comes into use across the whole population. We present the criteria by which strategies are selected for storage in the archive, the policy for deleting strategies when the archive's limited space is exhausted, and the mechanism for selecting archived strategies for use in the current environment.

I. INTRODUCTION

For modeling purposes we assume that human behavior is flexible and that, when presented with an environment, human beings sooner or later learn not only to survive but even to thrive in it. Environments change with time, and strategies which were efficient in the old environment become ineffective. Usually this change is subtle and slow, which allows even slow learners to deal with it. However, sometimes it is relatively fast (e.g.
switching from communism to capitalism, or the colonization of a region by people of a totally different culture). In the face of such abrupt changes, the society reacts with some members of the population learning the new rules of successful living relatively fast. Seeing their success, the others try to copy the successful strategies, and after some time the whole population adapts. Those who resist perform poorly and are sooner or later weeded out. Sometimes history repeats itself and an old environment presents itself again. In such cases, the old experiences related to the previous occurrence of that environment prove to be very useful. This human learning behavior can be modeled as an environment which changes from time to time, and which sometimes reverts to one of its previous forms. We have previously reported our experiments on the problem of learning in an environment where the task can change arbitrarily without any warning and the agents are expected to recover and learn the new task [1]. In this work we experiment with the notion of an archive and address the questions of how to populate the archive and when to use a strategy stored in it. If the strategies developed for old environments are available in an archive, we can periodically try them to see whether one of them is effective again in the current environment. For modeling purposes we use a simulated dynamic environment related to computer games.

Manuscript received November 14, 2008. This work was supported in part by the Higher Education Commission, Pakistan. Hasan Mujtaba is a PhD student at the CS Department, National University of Computer & Emerging Sciences, Islamabad, Pakistan. Dr. A. Rauf Baig is a Professor at the CS Department, National University of Computer & Emerging Sciences, Islamabad, Pakistan (e-mail: rauf.baig@nu.edu.pk).
Computer games provide an excellent test bed for experimentation with new models, algorithms and methods, and environments similar to ours have been used in many previous works, among them Flatland [2], NERO [3], Dead End [4], and Cellz [5]. In our environment, intelligent agents, propelled by an artificial neural network (ANN) based controller, are expected to perform a given task. The task can change abruptly without any warning. The learning takes the form of a search for appropriate weights of the ANN controller which enable the agents to perform their task. The learning method provides for continuous learning and allows the agents to keep improving their task-performing strategies while the task remains the same. It has a provision which continually weeds out low-performing agents and replaces them with reinitialized ones. The reinitialization is both random and from an archive of previously discovered strategies. After a task change, the performance of the agents decreases because they are still following the strategy useful for the old task. The continuous weeding out provides an elegant and smooth way to force the agents to find new strategies following the task change. Our research has the potential to be useful in efforts to make video games more interesting through the automatic evolution of agents to handle a new task.

The rest of the paper is organized as follows. Section II covers related work on this topic, and Section III covers our continuous learning algorithm and elaborates on how the agents learn to deal with the changes occurring in their tasks, with and without the help of the archive. Section IV presents the

Retaining the Lessons from Past for Better Performance in a Dynamic Multiple Task Environment
Hasan Mujtaba and A. Rauf Baig
978-1-4244-2959-2/09/$25.00 © 2009 IEEE
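As a rough illustration, the continuous learning loop with archive-based reinitialization described above can be sketched as follows. This is a minimal Python sketch under assumed parameters (population size, weeding fraction, archive capacity, a toy fitness function standing in for task performance); the actual algorithm and its criteria are detailed in Section III.

```python
import random

POP_SIZE = 30        # hypothetical population size
WEED_FRACTION = 0.2  # fraction of low performers reset each period
ARCHIVE_CAP = 10     # the archive has limited space

def random_weights(n=8):
    """Random ANN controller weights (stand-in for real initialization)."""
    return [random.uniform(-1.0, 1.0) for _ in range(n)]

def fitness(weights, task):
    """Toy stand-in for an agent's task performance (higher is better)."""
    return -sum((w - t) ** 2 for w, t in zip(weights, task))

def learning_step(population, archive, task):
    """One weeding/reinitialization cycle of the continuous learning loop."""
    ranked = sorted(population, key=lambda w: fitness(w, task))
    n_weed = int(len(ranked) * WEED_FRACTION)
    survivors = ranked[n_weed:]
    # Store the current best strategy in the archive; when the archive is
    # full, overwrite an old entry (a simplified deletion criterion).
    best = ranked[-1]
    if len(archive) < ARCHIVE_CAP:
        archive.append(list(best))
    else:
        archive[random.randrange(ARCHIVE_CAP)] = list(best)
    # Reinitialize the weeded agents: either randomly or from the archive
    # of previously discovered strategies.
    replacements = []
    for _ in range(n_weed):
        if archive and random.random() < 0.5:
            replacements.append(list(random.choice(archive)))
        else:
            replacements.append(random_weights())
    return survivors + replacements, archive

# Example run on one task; after an abrupt task change, archived strategies
# for a recurring old task would be retried in the same way.
task_a = [0.5] * 8
population = [random_weights() for _ in range(POP_SIZE)]
archive = []
for _ in range(20):
    population, archive = learning_step(population, archive, task_a)
```

The sketch keeps the two key properties of the mechanism: a sub-population is always freshly reinitialized (so a task change can be absorbed), and good strategies persist in a bounded archive from which they can rapidly re-enter the population.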