Improvement in Game Agent Control Using State-Action Value Scaling

Leo Galway, Darryl Charles and Michaela Black
School of Computing & Information Engineering
University of Ulster at Coleraine
Cromore Road, BT52 1SA, United Kingdom

Abstract. The aim of this paper is to enhance the performance of a reinforcement learning game agent controller, within a dynamic game environment, through the retention of learned information over a series of consecutive games. Using a variation of the classic arcade game Pac-Man, the Sarsa algorithm has been utilised for the control of the Pac-Man game agent. The results indicate that the use of state-action value scaling between games is successful in preserving prior knowledge, thereby improving the performance of the game agent when a series of consecutive games is played.

1 Introduction

Digital games provide an interesting test-bed for machine learning research due to the characteristically non-deterministic, dynamic nature of their environments [1]. In particular, the dynamic environments presented by predator/prey style games offer the advantage of being easily decomposed into a finite set of states, each with an associated set of reward values [2]. The use of machine learning techniques is required in order to generate reactive and believable game agent behaviours. However, the effective use of such algorithms is restricted by a number of requirements, including the necessity for game agent behaviours to be learned in response to a changing game environment [1], [3]. Consequently, by incorporating prior knowledge about the learning task into the learning algorithm and knowledge representation used, the performance of the game agent may be improved [4], [5]. Although a large variety of techniques exist within the machine learning domain, reinforcement learning provides an approach to agent-based learning which focuses on an agent's interactions with its environment [6].
As such, reinforcement learning provides a learning methodology appropriate for use within digital game environments, comprising a set of algorithms and techniques that learn a sequence of actions in order to maximise an accumulated, discounted reward received from the environment over a period of time. A control policy can thus be learned, through an agent's exploration and exploitation of the environment, without requiring explicit training from a domain expert [4], [6], [7], [8]. For a comprehensive discussion of reinforcement learning, please refer to [6]. Within the academic digital games research literature, reinforcement learning techniques have been applied to a variety of games in order to learn control policies for game agents. Research conducted includes the use of Sarsa(λ) to generate a near-optimal control policy for game agents in the fighting game "Tao Feng" [8], and both

155 ESANN'2008 proceedings, European Symposium on Artificial Neural Networks - Advances in Computational Intelligence and Learning. Bruges (Belgium), 23-25 April 2008, d-side publi., ISBN 2-930307-08-0.
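The tabular Sarsa update, the exploration/exploitation trade-off, and the idea of scaling state-action values between games can be sketched as follows. This is a minimal, illustrative sketch only: the function names, the learning rate, the discount factor and the scaling factor are assumptions for illustration, not values or schemes taken from the paper.

```python
import random
from collections import defaultdict

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])

def epsilon_greedy(Q, s, actions, epsilon=0.1):
    """Explore with probability epsilon, otherwise exploit the greedy action."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

def scale_q_values(Q, factor):
    """Between-games retention: carry learned values forward, scaled by a
    multiplicative factor (illustrative; the paper's exact scaling scheme
    is not reproduced here)."""
    for key in Q:
        Q[key] *= factor

# Tabular state-action values, defaulting to zero for unseen pairs.
Q = defaultdict(float)
```

In a Pac-Man setting, a state would encode the agent's local view of the maze and the actions its movement directions; the point of the scaling step is that, at the start of each new game, the value table is scaled and reused rather than discarded, so knowledge from earlier games is retained.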