Phrase Detectives: A Web-based Collaborative Annotation Game

Jon Chamberlain (University of Essex, Colchester, UK, jchamb@essex.ac.uk)
Massimo Poesio (University of Essex, Colchester, UK and Università di Trento, Trento, Italy, poesio@essex.ac.uk)
Udo Kruschwitz (University of Essex, Colchester, UK, udo@essex.ac.uk)

Abstract: Annotated corpora of the size needed for modern computational linguistics research cannot be created by small groups of hand annotators. One solution is to exploit collaborative work on the Web and one way to do this is through games like the ESP game. Applying this methodology however requires developing methods for teaching subjects the rules of the game and evaluating their contribution while maintaining the game entertainment. In addition, applying this method to linguistic annotation tasks like anaphoric annotation requires developing methods for presenting text and identifying the components of the text that need to be annotated. In this paper we present the first version of Phrase Detectives (http://www.phrasedetectives.org), to our knowledge the first game designed for collaborative linguistic annotation on the Web.

Key Words: Web-based games, distributed knowledge acquisition, object recognition, social networking, anaphoric annotation, user interaction, XML, Semantic Web

Category: H.5.2, I.2.5, I.2.6, I.2.7

1 Introduction

Perhaps the greatest obstacle to progress towards systems able to extract semantic information from text is the lack of semantically annotated corpora large enough to be used to train and evaluate semantic interpretation methods.
Recent efforts to create resources to support large evaluation initiatives in the USA, such as Automatic Content Extraction (ACE), Translingual Information Detection, Extraction and Summarization (TIDES), and GALE, are beginning to change this, but just at the point when the community is beginning to realize that even the 1M-word annotated corpora created in substantial efforts such as PropBank [Palmer et al., 2005] and the OntoNotes initiative [Hovy et al., 2006] are likely to be too small. Unfortunately, the creation of corpora of 100M-plus words via hand annotation is likely to be prohibitively expensive, as already realized by the creators of
Proceedings of I-SEMANTICS ’08, Graz, Austria, September 3-5, 2008