ORIGINAL ARTICLE

A WordNet-based semantic approach to textual entailment and cross-lingual textual entailment

Julio Javier Castillo

Received: 22 April 2011 / Accepted: 10 June 2011 / Published online: 1 July 2011
© Springer-Verlag 2011

Abstract In this paper we explain how to build a recognizing textual entailment (RTE) system that uses only semantic similarity measures based on WordNet. We show how the widely used WordNet-based semantic measures can be generalized into sentence-level semantic metrics for use in both monolingual and cross-lingual textual entailment. We experiment with a wide variety of RTE datasets and evaluate the contribution of an algorithm that expands the monolingual RTE corpus. Results achieved with this method yielded statistically significant differences when predicting RTE test sets. We provide an efficiency analysis of these metrics and draw some conclusions about their practical utility in recognizing textual entailment. We also analyze the cross-lingual textual entailment task, create a bilingual English–Spanish corpus, and propose a procedure to create a cross-lingual textual entailment corpus for any pair of languages. Finally, we show that the proposed method, which uses semantic information from WordNet as its only source of lexical-semantic knowledge, is sufficient to build an average-scoring RTE system for both monolingual and cross-lingual textual entailment.

Keywords Recognizing textual entailment · Cross-lingual textual entailment · WordNet · Expand corpus · Semantic measures · Machine learning

1 Introduction

The objective of the recognizing textual entailment (RTE) task [1] is to determine whether the meaning of a “hypothesis” (H) can be inferred from a “text” (T). Thus, we say that “T entails H” if a person reading T would infer that H is most likely true.
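To make the task concrete, the following is a minimal sketch of a T-H pair and of one simple way a word-level similarity could be lifted to a sentence-level score. It is an illustration under stated assumptions, not the system described in this paper: the names `Pair`, `word_sim`, `sentence_score` and `entails` are hypothetical, the toy similarity table merely stands in for a WordNet-based measure, the "best match per hypothesis word, then average" aggregation is one common choice among many, and the fixed threshold replaces the machine learning classifier that a real system would train.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Pair:
    """A text-hypothesis pair (hypothetical container for illustration)."""
    text: str
    hypothesis: str


# Toy stand-in for a WordNet-based word similarity in [0, 1].
# The values are invented for illustration only.
TOY_SIM: Dict[FrozenSet[str], float] = {
    frozenset({"bought", "purchased"}): 0.9,
    frozenset({"car", "vehicle"}): 0.8,
}


def word_sim(a: str, b: str) -> float:
    """Return 1.0 for identical words, a table value if known, else 0.0."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset({a, b}), 0.0)


def tokens(sentence: str) -> List[str]:
    """Lowercase whitespace tokenization with surrounding punctuation removed."""
    stripped = (w.strip(".,;:!?\"'") for w in sentence.split())
    return [w.lower() for w in stripped if w]


def sentence_score(pair: Pair) -> float:
    """Average, over hypothesis words, of the best similarity to any text word."""
    t, h = tokens(pair.text), tokens(pair.hypothesis)
    if not t or not h:
        return 0.0
    return sum(max(word_sim(hw, tw) for tw in t) for hw in h) / len(h)


def entails(pair: Pair, threshold: float = 0.7) -> bool:
    """Assumed decision rule: T entails H when the score reaches the threshold."""
    return sentence_score(pair) >= threshold
```

With these toy values, `entails(Pair("John bought a car.", "John purchased a vehicle."))` returns `True`, because every hypothesis word finds a close match in the text, whereas a hypothesis sharing no related words with the text scores 0.0 and is rejected.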
The two-way RTE task consists of deciding whether T entails H, in which case the pair is marked as “Entailment”; otherwise the pair is marked as “No Entailment”. This definition of entailment is based on (and assumes) average human understanding of language as well as average background knowledge, as can be seen in the following example (pair id = 33, RTE3 dataset).

T = “As leaders gather in Argentina ahead of this weekends regional talks, Hugo Chávez, Venezuela’s populist president, is using an energy windfall to win friends and promote his vision of 21st-century socialism.”

H = “Chávez is a follower of socialism.”

More recently, the RTE4 Challenge moved to a three-way task (classification into three classes) that consists of distinguishing among “Entailment”, “Contradiction” and “Unknown”, the last of which applies when there is no information to accept or reject the hypothesis.

In this paper, which is an expanded version of [2], we address the RTE problem by using a machine learning approach. All feature sets are WordNet-based, aimed at measuring the benefit of WordNet as a knowledge resource for the RTE task. Thus, we tested the classifiers most widely used by other researchers and showed how the training set could impact them.

Several authors [3–5], among others, have used WordNet in textual entailment tasks. In [6] the authors showed that some basic WordNet-based measures seem to be enough to build an average-scoring RTE system. We extend these

J. J. Castillo (✉)
National University of Cordoba-FaMAF, Cordoba, Argentina
e-mail: jotacastillo@gmail.com

J. J. Castillo
National Technological University-FRC, Cordoba, Argentina

Int. J. Mach. Learn. & Cyber. (2011) 2:177–189
DOI 10.1007/s13042-011-0026-z