In AAMAS 2007 Workshop on Adaptive and Learning Agents (ALAg 2007), pp. 34-39, Honolulu, HI, USA, May 2007.

Learning Policy Selection for Autonomous Intersection Management

Kurt Dresner and Peter Stone
University of Texas at Austin
Department of Computer Sciences
Austin, TX 78712 USA
{kdresner, pstone}@cs.utexas.edu

ABSTRACT

Few aspects of modern life inflict as high a cost on society as traffic congestion and automobile accidents. Current work in AI and Intelligent Transportation Systems aims to replace human drivers with autonomous vehicles capable of safely and efficiently navigating through the most hazardous city streets. Once such vehicles are common, interactions between multiple vehicles will be possible. Traffic lights and stop signs, which were designed for human drivers, may no longer be the best method for intersection control. Previously, we made the case for a reservation-based intersection control mechanism designed for autonomous vehicles, but compatible with human drivers. Including human drivers allows incremental deployability as well as support for those who drive for pleasure, but may result in significantly suboptimal performance, as human drivers may be present in dramatically varying proportions. In this paper, we develop a learning-based approach to determine which variant of the control mechanism will be most effective under given conditions, and then combine the resulting predictor with our multiagent intersection management mechanism, enabling it to determine when and how it should alter its configuration to best suit the current traffic conditions. Our extension is fully implemented and tested in simulation, and we provide experimental results demonstrating its efficacy.

1. INTRODUCTION

With the average American wasting 46 hours per year in traffic and accidents sapping upwards of $230 billion from the US economy annually, few activities take as high a financial or emotional toll on people as automobile travel [7, 9].
Intelligent Transportation Systems (ITS) is the field that focuses on integrating information technology with vehicles and transportation infrastructure to make travel safer and more efficient. Current work in AI and ITS aims to replace error-prone human drivers with autonomous vehicles capable of safely and efficiently navigating the most hazardous and congested roadways.

Once such vehicles become common, interactions between multiple vehicles will also be possible. Traffic lights and stop signs, designed for human drivers, may no longer be the best method for intersection control. We recently proposed a multiagent intersection control mechanism for autonomous vehicles [3, 5]. This system accommodates human drivers, but only with a large constant efficiency penalty. If the proportion of human drivers decreases, the system cannot exploit the more favorable conditions. While alluding to a mechanism for altering the configuration of the system online, we did not fully specify, implement, or experiment with it, nor did we provide any method for choosing an appropriate configuration given the current traffic conditions. Even if such a method were to exist, measuring the current proportion of human drivers at an intersection would necessitate expensive infrastructure beyond that already required by the intersection control mechanism.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. AAMAS’07, May 14–15, 2007, Honolulu, Hawaii, USA Copyright 2007 ACM X-XXXXX-XX-X/XX/XX ...$5.00.

In this paper, we make three main contributions.
First, we fully specify a configuration switching mechanism, implement it in simulation, and analyze its performance. Second, we demonstrate that a classifier trained on data from the multiagent communication protocol can select the most appropriate configuration for the current traffic conditions without additional protocol or infrastructure requirements. Third, we integrate this classifier into our switching mechanism, producing a fully-implemented system that smoothly and efficiently alters its configuration to suit changing traffic patterns. Despite the lack of specialized sensory equipment, the performance of our system approaches that of an omniscient agent able to select the optimal configuration based on knowledge of the upcoming traffic conditions.

2. RESERVATION-BASED INTERSECTION CONTROL

In our 2004 paper, we make the case for a new type of intersection control mechanism [3]. This mechanism, instead of communicating with human drivers through lights, communicates directly with the driver agents piloting autonomous vehicles. Driver agents “call ahead” to an agent stationed at the intersection, called an intersection manager, to reserve a region of space-time in the intersection. As part of the request, the driver agents include information about the physical qualities and capabilities of the vehicle, as well as a predicted arrival time, velocity, and desired direction of travel. The intersection manager, using an intersection control policy, decides whether or not to grant the driver agent’s request. Once a reservation is made, the driver agent must only enter the intersection in accordance with the parameters of the reservation. If the driver agent determines that this is not possible, it must either cancel the reservation or change the reservation. In scenarios comprising only autonomous vehicles, such a system can vastly outperform current intersection control mechanisms like stop signs and traffic lights.
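
The request, grant, and cancel exchange described above can be illustrated with a minimal sketch. It is not the policy used in our system. Every name in it (ReservationRequest, IntersectionManager, intersection_length, and the field names) is hypothetical, and the toy policy makes a strong simplifying assumption: the whole intersection is treated as a single exclusive resource, so a request is granted only if its crossing time window overlaps no other vehicle's reservation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReservationRequest:
    """Message a driver agent sends when it calls ahead (hypothetical fields)."""
    vehicle_id: int
    arrival_time: float      # predicted arrival time, seconds
    arrival_velocity: float  # predicted velocity at arrival, m/s
    direction: str           # desired direction of travel, e.g. "straight"
    vehicle_length: float    # one physical property of the vehicle, metres


@dataclass
class IntersectionManager:
    """Toy manager: the whole intersection is one exclusive resource (assumption)."""
    intersection_length: float = 20.0  # assumed size, metres
    reservations: dict = field(default_factory=dict)  # vehicle_id -> (start, end)

    def _window(self, req):
        # Time the vehicle occupies the intersection at constant velocity.
        crossing = (self.intersection_length + req.vehicle_length) / req.arrival_velocity
        return req.arrival_time, req.arrival_time + crossing

    def request(self, req):
        """Grant the request only if its window overlaps no other vehicle's window."""
        if req.arrival_velocity <= 0:
            return False
        start, end = self._window(req)
        for vid, (s, e) in self.reservations.items():
            # A vehicle's own earlier reservation is ignored, so a repeated
            # request acts as a change of reservation.
            if vid != req.vehicle_id and start < e and s < end:
                return False
        self.reservations[req.vehicle_id] = (start, end)
        return True

    def cancel(self, vehicle_id):
        """Release a reservation the driver agent can no longer honor."""
        self.reservations.pop(vehicle_id, None)


manager = IntersectionManager()
first = ReservationRequest(1, 10.0, 10.0, "straight", 5.0)
second = ReservationRequest(2, 11.0, 10.0, "left", 5.0)
granted_first = manager.request(first)    # window 10.0 s to 12.5 s
granted_second = manager.request(second)  # window 11.0 s to 13.5 s, conflicts
manager.cancel(1)
granted_retry = manager.request(second)   # no conflict after the cancellation
```

In this sketch the second vehicle is rejected while the first holds its reservation and is granted once the first cancels. A realistic policy would reason about which parts of the intersection each vehicle occupies over time rather than locking the entire intersection.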
Vehicles using such a system on average experience much lower delay, which is the increase in the time it takes for them to