A Game Theoretic Approach to Autonomous Two-Player Drone Racing

Riccardo Spica, Davide Falanga, Eric Cristofalo, Eduardo Montijano, Davide Scaramuzza, Mac Schwager

Abstract—To be successful in multi-player drone racing, a player must not only follow the race track in an optimal way, but also avoid collisions with the opponents. Since unveiling one’s own strategy to the adversaries is not desirable, this requires each player to independently predict the other players’ future actions. Nash equilibria are a powerful tool to model this and similar multi-agent coordination problems in which the absence of communication impedes full coordination between the agents. In this paper, we propose a novel receding horizon planning algorithm that, by exploiting sensitivity analysis within an iterated best response computational scheme, can approximate Nash equilibria in real time. The planner only requires that each player knows its own position (e.g., with GPS or SLAM) and can sense the other player’s relative position (e.g., with onboard vision). Our solution is demonstrated to compete effectively against alternative strategies in a large number of drone racing simulations. Hardware experiments with onboard vision sensing demonstrate the practicality of our strategy.

I. INTRODUCTION

Drone racing competitions have recently become popular both within the scientific community [1] and the general public [2]. Developing a fully autonomous racing drone, however, remains challenging. Most past research has focused on a time-trial style of racing: a single robot must complete a racing track in the shortest amount of time. This scenario poses a number of challenges in terms of dynamic modeling, onboard perception, localization and mapping, trajectory generation, and optimal control. Impressive results have been obtained in this context for a variety of autonomous vehicles such as cars [3], [4], motorcycles [5], and even sailboats [6].
Advanced navigation capabilities for autonomous UAVs, although not in a racing context, have also been demonstrated [7], [8].

Very little attention, on the other hand, has been devoted to the more classical multi-player style of racing, sometimes called rotocross among drone racing enthusiasts. In addition to the aforementioned challenges, this kind of race also requires direct competition with other agents, incorporating strategic blocking, faking, and opportunistic passing while avoiding collisions. Besides posing an interesting research problem, multi-player drone racing is therefore also a good testing ground for developing more widely applicable non-cooperative multi-robot planning strategies.

R. Spica, E. Cristofalo and M. Schwager are with the Department of Aeronautics and Astronautics, Stanford University, Stanford, CA 94305, USA rspica;ecristof;schwager@stanford.edu
D. Falanga and D. Scaramuzza are with the Robotics and Perception Group, University of Zurich, Switzerland falanga;sdavide@ifi.uzh.ch
E. Montijano is with Centro Universitario de la Defensa and Instituto de Investigación en Ingeniería de Aragón, Universidad de Zaragoza, Zaragoza 50018, Spain emonti@unizar.es

Figure 1: Representation of the race track used for the simulations. The track is parameterized by its center line $\tau$ and its half width $w_\tau$. Given the current robot position $p_i$, we can define a local track frame with origin $\tau_i$, the point of the center line closest to $p_i$, and with $t$ and $n$ being the local tangent and normal vectors to the track at $\tau_i$.

In this paper we present a real-time planning algorithm for a drone to race competitively against another drone, exhibiting this kind of strategic intelligence. Motivated by the success of Model Predictive Control (MPC) in the development of real-time optimal control schemes, we apply similar receding horizon control strategies in the context of multi-player drone racing.
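The local track frame described in the caption of Figure 1 can be illustrated with a short numerical sketch. The code below is not taken from the paper: it assumes a planar track whose center line is approximated by a polyline with distinct consecutive vertices, and the function names `local_track_frame` and `inside_track`, the left-pointing orientation of the normal, and the half-width check are all hypothetical choices made for illustration.

```python
import numpy as np


def local_track_frame(centerline, p):
    """Return (tau_i, t, n) for position p relative to a 2D polyline.

    Assumption (not from the paper): the center line is given as an
    (M+1, 2) array of distinct consecutive vertices.
    """
    centerline = np.asarray(centerline, dtype=float)
    p = np.asarray(p, dtype=float)
    a = centerline[:-1]                       # segment start points
    d = centerline[1:] - a                    # segment direction vectors
    seg_len2 = np.einsum("ij,ij->i", d, d)    # squared segment lengths
    # Projection parameter of p onto each segment, clamped to [0, 1].
    s = np.clip(np.einsum("ij,ij->i", p - a, d) / seg_len2, 0.0, 1.0)
    candidates = a + s[:, None] * d           # closest point on each segment
    k = np.argmin(np.linalg.norm(candidates - p, axis=1))
    t = d[k] / np.sqrt(seg_len2[k])           # unit tangent of that segment
    n = np.array([-t[1], t[0]])               # tangent rotated by +90 degrees
    return candidates[k], t, n


def inside_track(centerline, p, half_width):
    """Check that the lateral offset of p stays within the half width."""
    tau_i, _, n = local_track_frame(centerline, p)
    return bool(abs(n @ (np.asarray(p, dtype=float) - tau_i)) <= half_width)


if __name__ == "__main__":
    # Hypothetical circular track of radius 10, traversed counter-clockwise.
    angles = np.linspace(0.0, 2.0 * np.pi, 401)
    track = 10.0 * np.column_stack([np.cos(angles), np.sin(angles)])
    tau_i, t, n = local_track_frame(track, [12.0, 0.0])
    print("origin:", tau_i, "tangent:", t, "normal:", n)
    print("inside:", inside_track(track, [12.0, 0.0], half_width=3.0))
```

The brute-force search over all segments keeps the sketch short; a planner running in real time would more plausibly restrict the search to a window around the previous closest point.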
Unlike a standard MPC planner, however, our strategy also takes into account the other agents’ reactions to the ego agent’s actions. This is achieved by employing an iterated best response computation scheme: an optimal control problem is solved alternately for each player while the other player’s strategy is held constant. In addition, in order to fully capture and exploit the effects of the collision avoidance constraints, we also use sensitivity analysis to approximate the effects of one player’s actions on its opponent’s cost. Although Nash equilibria are often difficult to achieve or verify in dynamic games, we prove that, if our algorithm converges, the point it converges to satisfies the necessary conditions for a Nash equilibrium. In practice, we find that the algorithm does converge, so that this result provides a theoretical foundation for our technique. The algorithm also runs in real time, at 20 Hz, on standard hardware.

This work focuses on the competition and interaction between the drones, not on the perception and navigation of the race course or environment. Hence, we assume that each
arXiv:1801.02302v1 [cs.RO] 8 Jan 2018