Predicting Dominance Rankings for Score-based Games

Spyridon Samothrakis, Diego Perez, Philipp Rohlfshagen, Simon M. Lucas

Abstract—Game competitions may involve different player roles and be score-based rather than win/loss based. This raises the issue of how best to draw opponents for matches in ongoing competitions, and how best to rank the players in each role. An example is the Ms Pac-Man vs Ghosts Competition, which requires competitors to develop software controllers to take charge of the game's protagonists: participants may develop software controllers for either or both Ms Pac-Man and the team of four ghosts. In this paper we compare two ranking schemes for win/loss games, Bayes Elo and Glicko. We convert the game into one of win/loss ("dominance") by matching controllers of identical type against the same opponent in a series of pair-wise comparisons. This implicitly creates a "solution concept" as to what constitutes a good player. We analyse how many games are needed under each algorithm before one can infer the strength of the players, according to our proposed solution concept, without performing an exhaustive evaluation. We show that Glicko should be the method of choice for online score-based game competitions.

I. INTRODUCTION

Games provide excellent test-beds in which to develop, test and compare novel techniques in computational intelligence (CI). In the past, board games have served this purpose, with famous examples including Chess, Checkers and Othello. More recently, video games have attracted the attention of researchers both as a test-bed for and an application of CI methods.
Video games typically offer a more visceral challenge compared to the cerebral appeal of board games, but are equally interesting from a machine intelligence point of view: arcade games such as Ms Pac-Man have been developed to be engaging, and the variety of human and computer-based opponents provides a robust way to test the efficacy of new algorithms. The popularity of many games also makes them a useful tool in education to convey complex subject matter in an interactive and entertaining fashion.

Game competitions are a vital aid, as they allow researchers to test and evaluate their algorithms easily and under the exact same conditions. Competitions are also useful to attract a fresh cohort of students, researchers and game enthusiasts to the area. Within the IEEE Computational Intelligence Society, the Games Technical Committee has nurtured many interesting competitions that have leveraged existing video games or reasonably faithful implementations of them. In a similar fashion, the Game Intelligence Group at the University of Essex has organised numerous game competitions in recent years, attracting an ever-increasing number of participants from academia as well as the private sector. The most popular of these competitions is the Ms Pac-Man vs Ghosts Competition [29], which has been running since 2011 with two iterations each year. The competition requires competitors to create software controllers for either (or both) Ms Pac-Man and the ghosts that interface directly with the game. Competitors may submit and re-submit entries at any time prior to the deadline.

(Spyridon Samothrakis, Diego Perez and Simon M. Lucas are with the School of Computer Science and Electronic Engineering, University of Essex, Colchester CO4 3SQ, United Kingdom; email: {ssamot, dperez, sml}@essex.ac.uk. Philipp Rohlfshagen is with Schneider Electric, Adelaide 5000, Australia; email: philipp.rohlfshagen@schneider-electric.com.)
Previously, all submissions would compete with one another in a round-robin tournament to establish the best controllers: Ms Pac-Man controllers attempt to maximise the score of the game while the ghosts strive to minimise it; entries were ranked according to their total average score. As the competition grew in size, the round-robin format became increasingly time-consuming and eventually infeasible. The score-based evaluation also exhibited artefacts that occasionally favoured undesirable behaviours. For instance, it was possible for a controller to rank highly by playing extremely well against a (small) subset of opponents while performing only averagely against the rest.

Fortunately, a wealth of alternative rating and ranking schemes exists, the more sophisticated of which compute a skill rating for each player that is updated whenever a new game outcome (win, loss or draw) has been established. The rating of a player changes in proportion to the skill of its opponent. Unfortunately, these rating systems are not directly applicable to the Ms Pac-Man vs Ghosts Competition, as Ms Pac-Man is a score-based game. Furthermore, the controllers involved in each comparison are heterogeneous: while Ms Pac-Man competes against the ghosts in each game played, it is the comparison of controllers of the same type that establishes the rankings. Moreover, the game and objective are substantially different for Ms Pac-Man compared to the ghosts. It is nevertheless possible to take these factors into consideration and to use a rating scheme such as Glicko [15] for this competition. To the best of our knowledge, this is the first application of win/loss schemes to an asymmetric score-based game. We compare Bayes Elo and Glicko, two popular ranking algorithms, in the context of online game competitions.
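The dominance conversion described above can be sketched in a few lines: two controllers of the same type (e.g. two Ms Pac-Man controllers) each play against the identical opponent, and the higher score is recorded as a win. As a minimal illustration of how such win/loss outcomes then drive a rating update, the classic Elo expected-score rule stands in here for the more involved Bayes Elo and Glicko machinery; the function names and the K-factor are illustrative, not taken from the competition software.

```python
def dominance_outcome(score_a, score_b):
    """Convert two scores obtained against the SAME opponent into a
    win/loss/draw outcome for controller A: 1.0 win, 0.0 loss, 0.5 draw."""
    if score_a > score_b:
        return 1.0
    if score_b > score_a:
        return 0.0
    return 0.5


def elo_update(r_a, r_b, outcome, k=32.0):
    """One Elo-style update (illustrative stand-in for Bayes Elo/Glicko).

    The expected score comes from the logistic curve on the rating
    difference; each rating then moves in proportion to the surprise
    (actual outcome minus expectation), so beating a stronger opponent
    yields a larger gain.
    """
    expected_a = 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))
    r_a_new = r_a + k * (outcome - expected_a)
    r_b_new = r_b + k * ((1.0 - outcome) - (1.0 - expected_a))
    return r_a_new, r_b_new


# Two hypothetical Ms Pac-Man controllers, scored against the same ghost team.
result = dominance_outcome(12000, 9000)          # controller A dominates: 1.0
new_a, new_b = elo_update(1500.0, 1500.0, result)  # equal ratings, A wins: +16/-16
```

Glicko extends this update with a rating-deviation term that widens for inactive players, which is one reason it suits an online competition with sporadic submissions.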
The remainder of this paper is structured as follows: Section II introduces the game of Ms Pac-Man and the Ms Pac-Man vs Ghosts Competition, which forms the bulk of our data. This is followed in Section III by a brief review of ranking schemes and other gaming competitions. These two sections form the background information of this paper. In Section IV we introduce our methodological choices. An experimental comparison is subsequently presented in Section V, followed by conclusions and a discussion of prospects for future work in Section VI.

II. MS PAC-MAN VS GHOSTS COMPETITION

The Game Intelligence Group at the University of Essex has been running game competitions since 2007, including the Ms