Translation- and projection-based unsupervised coreference resolution for Polish ⋆

Maciej Ogrodniczuk

Institute of Computer Science, Polish Academy of Sciences

Abstract. Creating a coreference resolution tool for a new language is a challenging task because of the substantial effort required to develop the associated linguistic data, regardless of whether the approach is rule-based or statistical. In this paper, we test the translation- and projection-based method for an inflectional language, evaluate the result on a corpus of general coreference and compare the results with state-of-the-art solutions of this type for other languages.

1 Introduction

The widely known problem of coreference resolution, the process of “determining which NPs in a text or dialogue refer to the same real-world entity” [1], which is crucial for higher-level NLP applications such as text summarisation, text categorisation and textual entailment, has so far been tackled from many perspectives. However, there are still languages for which no state-of-the-art solutions are available, most likely because of the substantial effort required to develop language resources and tools, some of them knowledge-intensive, leading either to the development of language-specific rules or to the preparation of training data for statistical approaches.

One solution to this problem is to follow the translation-projection path, i.e., (1) translating the text to be coreferentially annotated from the source language into the target language, for which coreference resolution tools are available, (2) running the target-language coreference resolver, (3) transferring the produced annotations (mentions, i.e. discourse world entities, and clusters, i.e. sets of mentions referring to the same entity) from the target to the source language. Such a solution has so far been proposed e.g. by Rahman and Ng [2] and evaluated for Spanish and Italian with projection from English (see Section 2).
Although the source and target languages in this setting come from two different language families, both differ markedly from inflectional languages such as Polish, which makes it worthwhile to test the approach on other language pairs.

⋆ The work reported here was carried out within the Computer-based methods for coreference resolution in Polish texts (CORE) project financed by the Polish National Science Centre (contract number 6505/B/T02/2011/40) and the University Research Program for Google Translate.
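
The three-step translation-projection path outlined in the introduction can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this paper: the function names, the representation of mentions as token spans, and the word alignment given as a dictionary are all assumptions made for illustration, and the translator and resolver are toy stand-ins with hard-coded output rather than real tools.

```python
from typing import Callable, Dict, List, Optional, Tuple

Span = Tuple[int, int]           # token span [start, end), assumed mention format
Cluster = List[Span]             # mentions referring to the same entity
Alignment = Dict[int, List[int]] # target token index -> source token indices


def project_span(span: Span, alignment: Alignment) -> Optional[Span]:
    """Map a target-language span onto the smallest covering source-language span."""
    start, end = span
    aligned = [s for t in range(start, end) for s in alignment.get(t, [])]
    if not aligned:
        return None  # no aligned source token, so the mention cannot be projected
    return (min(aligned), max(aligned) + 1)


def resolve_by_projection(
    source_tokens: List[str],
    translate: Callable[[List[str]], Tuple[List[str], Alignment]],
    resolve: Callable[[List[str]], List[Cluster]],
) -> List[Cluster]:
    # Step 1: translate the source text into the target language.
    target_tokens, alignment = translate(source_tokens)
    # Step 2: run the target-language coreference resolver.
    target_clusters = resolve(target_tokens)
    # Step 3: transfer mentions and clusters back to the source text.
    projected: List[Cluster] = []
    for cluster in target_clusters:
        spans: Cluster = []
        for mention in cluster:
            span = project_span(mention, alignment)
            if span is not None and span not in spans:
                spans.append(span)
        if spans:
            projected.append(spans)
    return projected


# Toy stand-ins (hypothetical): hard-coded output for one example only.
SOURCE = ["Maria", "kupiła", "książkę", ".", "Przeczytała", "ją", "."]


def toy_translate(tokens: List[str]) -> Tuple[List[str], Alignment]:
    target = ["Maria", "bought", "a", "book", ".", "She", "read", "it", "."]
    alignment = {0: [0], 1: [1], 2: [2], 3: [2], 4: [3],
                 5: [4], 6: [4], 7: [5], 8: [6]}
    return target, alignment


def toy_resolve(tokens: List[str]) -> List[Cluster]:
    # "Maria" ~ "She" and "a book" ~ "it"
    return [[(0, 1), (5, 6)], [(2, 4), (7, 8)]]


if __name__ == "__main__":
    for cluster in resolve_by_projection(SOURCE, toy_translate, toy_resolve):
        print([" ".join(SOURCE[s:e]) for s, e in cluster])
```

In this toy example the English pronoun "She" has no separate Polish counterpart and is aligned to the verb "Przeczytała", so the projected mention lands on the verb. This illustrates one way in which projection into an inflectional language can differ from the language pairs tested earlier.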