Analyzing Complex Strategic Interactions in Multi-Agent Systems

William E. Walsh, Rajarshi Das, Gerald Tesauro, Jeffrey O. Kephart
IBM T.J. Watson Research Center
19 Skyline Drive, Hawthorne, NY 10532, USA
{wwalsh1, rajarshi, gtesauro, kephart}@us.ibm.com

Abstract

We develop a model for analyzing complex games with repeated interactions, for which a full game-theoretic analysis is intractable. Our approach treats exogenously specified, heuristic strategies, rather than the atomic actions, as primitive, and computes a heuristic-payoff table specifying the expected payoffs of the joint heuristic strategy space. We analyze a particular game based on the continuous double auction, and compute Nash equilibria of previously published heuristic strategies. To determine the most plausible equilibria, we study the replicator dynamics of a large population playing the strategies. To account for errors in estimation of payoffs or improvements in strategies, we analyze the dynamics and equilibria based on perturbed payoffs.

Introduction

Understanding complex strategic interactions in multi-agent systems is assuming an ever-greater importance. In the realm of agent-mediated electronic commerce, for example, authors have recently discussed scenarios in which self-interested software agents execute various dynamic pricing strategies, including posted pricing, bilateral negotiation, and bidding. Understanding the interactions among various strategies can be extremely valuable, both to designers of markets (who wish to ensure economic efficiency and stability) and to designers of individual agents (who wish to find strategies that maximize profits). More generally, by demystifying strategic interactions among agents, we can improve our ability to predict (and therefore design) the overall behavior of multi-agent systems, thus reducing one of the canonical pitfalls of agent-oriented programming (Jennings & Wooldridge 2002).
In principle, the (Bayes) Nash equilibrium is an appropriate concept for understanding and characterizing the strategic behavior of systems of self-interested agents. In practice, however, it is infeasible to compute Nash equilibria for all but the very simplest interactions. For some types of repeated interactions, such as continuous double auctions (Rust, Miller, & Palmer 1993) and simultaneous ascending auctions (Milgrom 2000), even formulating the information structure of the extensive-form game, much less computing the equilibrium, remains an unsolved problem.

Copyright © 2002, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Given this state of affairs, it is typical to endow agents with heuristic strategies, comprising hand-crafted or learned decision rules on the underlying primitive actions as a function of the information available to an agent. Some strategies are justified on the basis of desirable properties that can be proven in simplified or special-case models, while others are based on a combination of economic intuition and engineering experience (Greenwald & Stone 2001; Tesauro & Das 2001).

In this paper, we propose a methodology for analyzing complex strategic interactions based on high-level, heuristic strategies. The core analytical components of our methodology are Nash equilibrium of the heuristic strategies, dynamics of equilibrium convergence, and perturbation analysis.

Equilibrium and the dynamics of equilibrium convergence have been widely studied, and our adoption of these tools is by no means unique. Yet, these approaches have not been widely and fully applied to the analysis of heuristic strategies. A typical approach to evaluating the strategies has been to compare various mixtures of strategies in a structured or evolutionary tournament (Axelrod 1997; Rust, Miller, & Palmer 1993; Wellman et al. 2001), often with the goal of establishing which strategy is the “best”.
Sometimes, the answer is well-defined, as in the Year 2000 Trading Agent Competition (TAC-00), in which the top several strategies were quite similar, and clearly superior to all other strategies (Greenwald & Stone 2001). In other cases, including recent studies of strategies in continuous double auctions (Tesauro & Das 2001) and in TAC-01, there does not appear to be any one dominant strategy.

The question of which strategy is “best” is often not the most appropriate, given that a mix of strategies may constitute an equilibrium. The tournament approach itself is often unsatisfactory because it cannot easily provide a complete understanding of multi-agent strategic interactions, since the tournament play is just one trajectory through an essentially infinite space of possible interactions. One can never be certain that all possible modes of collective behavior have been explored.

Our approach is a more principled and complete method for analyzing the interactions among heterogeneous heuristic strategies. Our methodology, described more fully in the following sections, entails creating a heuristic-payoff table: an analog of the usual payoff table, except that the