Zero Intelligence Plus and Gjerstad-Dickhaut Agents for Sealed Bid Auctions

A. J. Bagnall and I. E. Toft
School of Computing Sciences
University of East Anglia
Norwich, England, NR4 7TJ
{ajb,it}@cmp.uea.ac.uk

Abstract

The increasing prevalence of auctions as a method of conducting a variety of transactions has promoted interest in modelling bidding behaviours with simulated agent models. The majority of popular research has focused on double auctions, i.e. auctions with multiple buyers and sellers. In this paper we investigate agent models of sealed bid auctions, i.e. single seller auctions where each buyer submits a single bid. We propose an adaptation of two learning mechanisms used in double auctions, Zero Intelligence Plus (ZIP) and Gjerstad-Dickhaut (GD), for sealed bid auctions. The experimental results determine whether a single agent adopting the ZIP or GD bidding mechanism is able to learn the known optimal strategy through experience. We experiment with two types of sealed bid auctions: first price sealed bid and second price sealed bid. Quantitative analysis shows that whilst ZIP agents learn a good strategy they do not learn the optimal strategy, whereas GD agents learn an optimal strategy in first price auctions.

1. Introduction

The increase in the level of Internet connectivity has allowed the WWW to become a hub for electronic trading places. Buyers and sellers are now able to trade in previously inaccessible markets. Some of the important questions facing market overseers and traders are: what are the optimal strategies for a given auction structure; how do agents learn the optimal strategy; and how does restriction of information prevent agents from learning a strategy? These questions have been addressed through auction theory [7, 12], field studies [8], experimental lab studies [9], and agent simulations [2, 4]. Recently there has been particular interest in the study of agents for continuous double auctions (CDA) [2, 3, 5, 6, 10].
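To make the later discussion of ZIP concrete, the core of the classic CDA mechanism is a Widrow-Hoff error-correction step that nudges an agent's shout price toward a target price derived from recent market activity. The sketch below shows only that update step for a single agent; the learning rate value and the way the target is chosen here are illustrative, not the paper's sealed-bid adaptation.

```python
def zip_step(limit, margin, target, beta=0.3):
    """One Widrow-Hoff update of a ZIP-style profit margin.

    limit  - the agent's private limit price
    margin - current profit margin (shout price = limit * (1 + margin))
    target - target price inferred from recent market events
    beta   - learning rate (0.3 is an illustrative choice)

    Returns the updated margin, moving the shout price a fraction
    beta of the way toward the target.
    """
    price = limit * (1.0 + margin)
    delta = beta * (target - price)      # error-correction step
    new_price = price + delta
    return (new_price / limit) - 1.0     # recover the updated margin
```

For example, a buyer with limit 100 and margin -0.2 shouts 80; given a target of 90 and beta 0.3, the next shout moves to 83 (margin -0.17).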
We adapt learning mechanisms developed for CDA to investigate agent architectures for sealed bid auctions. We broadly classify agent architectures in the following way: memory free agents, memory based agents and modelling agents.

The simplest type of agent stores no explicit information about past auctions and simply reacts to the previous auction outcomes. These so called memory free reactive agents have been examined extensively in [1, 2]. The second type of agent we call memory based agents. These agents store some historical information about auctions and adjust their strategy based on some estimate of a global picture of auction outcomes. They are considered to be more sophisticated than memory free agents and have been used in [5, 10]. The third type of agent we consider to be a modelling agent; these agents also store information about past auctions. Rather than using the market information directly, these agents form models of competitors' behaviour to estimate the correct action or strategy (for example, see [6]).

It is our belief that prior to examining agent behaviour in complex, dynamic multi-agent systems, any agent architectures should be tested in learning environments where a known optimal strategy exists. In this paper we examine the success of a single adaptive agent in learning the optimal strategy when competing against a population of non-adaptive agents. We use the Private Values Model (PVM) for auctions because, under some constraints, there are provably optimal strategies. These strategies provide a metric with which we may assess the ability of an adaptive agent in learning a strategy. They are also an obvious choice of strategy for the non-adaptive agents.

The rest of this paper is structured as follows: Section 2 describes the auction model and the simulation structure. Section 3 details how the memory free ZIP [2] algorithm has been adjusted for sealed bid auctions and assesses how well it performs in simulations.
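Under the standard PVM constraints (risk-neutral bidders with valuations drawn i.i.d. uniform on [0, 1]), the provably optimal strategies referred to above have simple closed forms, which is what makes them usable as a benchmark. A minimal sketch, assuming that uniform setting (the function names are ours, not the paper's):

```python
def optimal_bid_first_price(value, n_bidders):
    """Symmetric equilibrium bid in a first price sealed bid auction,
    assuming risk-neutral bidders with i.i.d. uniform [0, 1] values:
    shade the bid down to (n - 1)/n of the private value."""
    return value * (n_bidders - 1) / n_bidders

def optimal_bid_second_price(value):
    """In a second price (Vickrey) auction, truthful bidding is a
    dominant strategy, so the optimal bid equals the private value."""
    return value
```

An adaptive agent can then be scored by how closely its learned bid function approaches these benchmarks, e.g. with four bidders a value of 0.8 should map to a first price bid of 0.6 but a second price bid of 0.8.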
Our experiments demonstrate that the complexity of the problem is such that memory free agents learn a good but suboptimal strategy. Section 4 de-