Lexical Entailment for Information Retrieval

Stéphane Clinchant, Cyril Goutte, and Eric Gaussier
Xerox Research Centre Europe
6, chemin de Maupertuis, F-38240 Meylan, France
Cyril.Goutte@xrce.xerox.com, etc.

Abstract. Textual Entailment has recently been proposed as an application-independent task of recognising whether the meaning of one text may be inferred from another. This is potentially a key task in many NLP applications. In this contribution, we investigate the use of various lexical entailment models in Information Retrieval, using the language modelling framework. We show that lexical entailment potentially provides a significant boost in performance, similar to pseudo-relevance feedback, but at a lower computational cost. In addition, we show that the performance is relatively stable with respect to the corpus the lexical entailment measure is estimated on.

1 Introduction

Textual Entailment has recently been proposed [6] as an application-independent task of recognising whether the meaning of one text may be inferred from another text. As such, it plays a role in a variety of Natural Language Processing applications such as Question Answering or Machine Translation. Textual entailment may also impact Information Retrieval (IR) in at least two ways. First, the notion of relevance bears strong similarities with that of entailment. Second, the notion of entailment may offer a way to capture non-obvious dependencies between queries and documents which are not captured by simple word-based similarities. Although the general task of recognising textual entailment is potentially much broader, the practical probabilistic approach proposed for example in [10] relies on word-based lexical probabilities, and amounts to assessing whether one lexical unit is entailed by another. This approach is reminiscent of the lexical statistics used to characterize semantic domains or topics in IR.
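To make the word-level view concrete, the following is a minimal sketch (not the paper's actual model) of how lexical entailment probabilities can be plugged into a language-modelling retrieval score, in the spirit of translation language models: the probability of a query term given a document is a sum over document terms weighted by how strongly each document term entails the query term. The entailment table and probabilities below are made-up illustrations.

```python
import math
from collections import Counter

# Hypothetical entailment probabilities P(query term | document term).
# In practice these would be estimated from a corpus; here they are toy values.
p_entail = {
    ("automobile", "car"): 0.4,
    ("car", "car"): 0.9,
    ("engine", "car"): 0.1,
}

def lm_entailment_score(query, doc, mu=0.5, collection_prob=1e-4):
    """Log P(query | doc), with query-term probabilities obtained by
    summing entailment-weighted document term probabilities, smoothed
    with a flat background model (mixing weight mu)."""
    counts = Counter(doc)
    total = len(doc)
    score = 0.0
    for w in query:
        # P(w | d) = sum over document terms u of P(w | u) * P(u | d)
        p_w = sum(p_entail.get((u, w), 0.0) * c / total
                  for u, c in counts.items())
        score += math.log(mu * p_w + (1 - mu) * collection_prob)
    return score

# A document never containing "car" can still score well on the query
# ["car"], because "automobile" and "engine" partly entail it.
doc = ["automobile", "engine", "engine"]
print(lm_entailment_score(["car"], doc))
```

The smoothing with a background collection probability mirrors standard language-modelling practice and keeps the log defined when no document term entails the query term.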
Most lexical statistics studies rely on standard similarity measures in order to derive affinities between words, affinities that can then be used to (a) create thesauri that can in turn be used for indexing or query enrichment purposes, or (b) compute an extended similarity between documents and queries. Along the first line, one finds works pertaining to thesaurus construction, exemplified, for example, by the phrase construction procedure of [15] and the various similarity levels of [11]. Along the second line, one finds works that embed term similarities within the computation of the similarity between queries and documents. The generalized vector space model of [16], the similarity