Sense-based Blind Relevance Feedback for Question Answering

Matteo Negri
ITC-irst, Centro per la Ricerca Scientifica e Tecnologica
Via Sommarive 18, 38050 Povo (TN), Italy
negri@itc.it

ABSTRACT

This paper addresses the problem of enhancing document retrieval under the specific restrictions posed by the Question Answering scenario. In particular, given an input question, we aim at defining a reliable method for expanding its keywords with semantic information extracted from WordNet (e.g. synonyms or hypernyms). This is a challenging task, since it is intrinsically dependent on high-quality disambiguation of natural language questions, which so far has been out of the reach of state-of-the-art Word Sense Disambiguation tools. The proposed solution relies on a two-step access to the target document collection, and can be seen as a "sense-based" relevance feedback. According to this technique, once the top documents d1, d2, ..., dn have been retrieved using the question keywords, the most frequent senses of the question terms within those documents are considered, instead of drawing for expansion the most relevant words that appear in d1, d2, ..., dn. Query enrichment is then carried out by adding terms semantically related to these senses. Our experiments, carried out using part of the TREC-2003 factoid question set and the AQUAINT corpus for document retrieval, demonstrate the viability of this approach. Preliminary results show that the application of Sense-based Relevance Feedback to the QA scenario can improve retrieval by up to 7% in terms of answer-bearing documents obtained with the best performing expansion strategy.

Keywords

Question Answering, Word Sense Disambiguation, Query Expansion, WordNet

1. INTRODUCTION

Dealing with the general problem of finding textual information that is relevant with respect to a particular user's information need, Information Retrieval (IR) and Question Answering (QA) systems both face the challenges posed by language variability and word ambiguity. In both frameworks, one of the major needs is to bridge the gap between the query terms (the query space) and the actual form in which the sought-after information is stored in the target collection (the document space). Often, in fact, queries do not contain the same words that are used in the document space to represent concepts [18]. The impact of the mismatch between the query space and the document space becomes particularly evident in the QA scenario, where (i) actual answers are required instead of relevant documents; and (ii) the target collection often contains a limited number of relevant text passages from which answers can be mined ([14] reports an average of 7.0/5.0/3.0 correct answer-bearing documents per question at TREC-9/10/11 respectively). Since correct answers are likely to appear within documents which have little or no similarity to the question keywords, point (i) marks a substantial difference between IR (whose purpose is just finding relevant documents with respect to a given topic statement) and the more refined QA task. As a consequence, while IR techniques have proved effective at locating, within large collections of documents, those relevant to a user's query, more fine-grained techniques are required in QA. At the same time, point (ii) demands that these techniques not be too restrictive, since the performance of the retrieval module represents an upper bound for the overall system's performance: if a relevant document is missed, there is no way to return a correct answer at the end of the whole process.
The modalities for query expansion with terms related (either semantically, as discussed in this paper, or by virtue of their co-occurrence tendency) to the words of an input query represent one of the many facets of the problem, and have become crucial issues for both the IR and QA communities. However, while IR research in this direction has reached a certain degree of maturity, little has been done with respect to the specificities of the QA benchmark, where it is still not clear what kinds of expansion techniques are best suited to the task. Focusing on the specific issues raised by query expansion with semantically related terms in the QA scenario, our work aims to fill this gap. In this direction, we analyze different WordNet-based query enrichment techniques whose common underlying approach is a two-step access to the local document collection. This approach can be seen as a semantically oriented variant of Blind Relevance Feedback (BRF). As in BRF ([1], [12]), terms for expansion are drawn from the top documents d1, d2, ..., dn obtained after a first search with the original query keywords. The novelty introduced here is that, instead of taking as additional expansion terms the most relevant words contained in d1, d2, ..., dn, we consider the most frequent senses of the original question's keywords. Experimenting with an unsupervised Word Sense Disambiguation (WSD) tool, we aim to demonstrate that the recognition of these senses within d1, d2, ..., dn is a more feasible task than the disambiguation
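The two-step scheme described above can be illustrated with a minimal sketch. This is not the paper's implementation: a tiny hand-made sense inventory (`SENSES`) stands in for WordNet, keyword-overlap ranking stands in for a real retrieval engine, and the dominant-sense heuristic simply counts sense clue words in the top-ranked documents. All names and data here are hypothetical, chosen only to make the pipeline concrete.

```python
# Sketch of sense-based blind relevance feedback: (1) first-pass retrieval
# with the question keywords, (2) pick each keyword's dominant sense from
# the top documents, (3) expand the query with terms related to that sense.
from collections import Counter

# Toy sense inventory standing in for WordNet:
# word -> list of (sense_id, clue_words, expansion_terms)
SENSES = {
    "bank": [
        ("bank#1", {"money", "account", "loan"}, {"lender", "institution"}),
        ("bank#2", {"river", "shore", "water"}, {"riverbank", "slope"}),
    ],
}

def retrieve(keywords, corpus, n=3):
    """First-pass retrieval: rank documents by raw keyword overlap."""
    scored = sorted(corpus, key=lambda d: -len(keywords & set(d.split())))
    return scored[:n]

def dominant_sense(word, top_docs):
    """Pick the sense whose clue words occur most often in the top docs."""
    tokens = Counter(t for d in top_docs for t in d.split())
    counts = Counter()
    for sense_id, clues, _ in SENSES.get(word, []):
        counts[sense_id] = sum(tokens[c] for c in clues)
    return counts.most_common(1)[0][0] if counts else None

def expand(keywords, corpus):
    """Two-step access: retrieve, disambiguate, then enrich the query."""
    top_docs = retrieve(keywords, corpus)
    expanded = set(keywords)
    for w in keywords:
        best = dominant_sense(w, top_docs)
        for sense_id, _, terms in SENSES.get(w, []):
            if sense_id == best:
                expanded |= terms
    return expanded

corpus = [
    "the bank raised the loan interest and froze the account",
    "money kept in a bank account earns interest",
    "the river overflowed its bank near the shore",
]
# Financial documents dominate the first pass, so the financial sense of
# "bank" is selected and only its related terms are added to the query.
print(sorted(expand({"bank", "interest"}, corpus)))
```

The point of the second pass is visible in the output: although one retrieved document uses "bank" in the river sense, the expansion terms come only from the sense that dominates the feedback documents, avoiding the topic drift that unrestricted synonym expansion would cause.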