Approaching Textual Entailment with LFG and FrameNet Frames

Aljoscha Burchardt
Dept. of Computational Linguistics, Saarland University, Saarbrücken, Germany
albu@coli.uni-sb.de

Anette Frank
Dept. of Computational Linguistics, Saarland University & Language Technology Lab, DFKI GmbH, Saarbrücken, Germany
frank@coli.uni-sb.de

Abstract

We present a baseline system for modeling textual entailment that combines deep syntactic analysis with structured lexical meaning descriptions in the FrameNet paradigm. Textual entailment is approximated by degrees of structural and semantic overlap of text and hypothesis, which we measure in a match graph. The encoded measures of similarity are processed in a machine learning setting.[1]

1 Introduction

In this paper, we present a baseline system for approaching the textual entailment task as presented in the PASCAL RTE Challenge. This task involves complex examples from unrestricted domains, a challenge for deep semantics-based processing. Similar to previous work (Dagan et al., 2005), we explore semantically informed approximations of textual entailment. As shown by Bos and Markert (2005), fine-grained semantic analysis and reasoning models can yield high precision, but are severely restricted in recall. The architecture we present is open for extension to deeper methods.

We assess the utility of approximating entailment in terms of structural and semantic overlap of text and hypothesis, combining wide-coverage LFG parsing with frame semantics to project a lexical semantic representation with semantic roles. We compute various measures of overlap to train a machine learning model for entailment.

[1] This work has been carried out in the project SALSA, funded by the German Science Foundation DFG, title PI 154/9-2. We thank Katrin Erk and Sebastian Pado for providing and supporting the Fred and Rosy systems and Alexander Koller for his contributions and for implementing the FEFViewer.
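The idea of approximating entailment by overlap can be sketched in a few lines. The following is a minimal illustrative sketch, not the authors' implementation: it measures what fraction of the hypothesis's semantic labels (here, FrameNet frame names standing in for nodes of a match graph) are covered by the text; such scores would then serve as features for a learned classifier. The frame names and the `overlap` function are hypothetical examples.

```python
# Illustrative sketch (not the paper's actual system): approximate
# entailment by how much of the hypothesis is covered by the text.

def overlap(text_labels, hyp_labels):
    """Fraction of hypothesis labels that also occur in the text.

    Labels could be LFG predicates, FrameNet frames, or role fillers;
    here they are plain strings for illustration.
    """
    if not hyp_labels:
        return 0.0
    matched = set(text_labels) & set(hyp_labels)
    return len(matched) / len(hyp_labels)

# Toy example: frames evoked by a text and a shorter hypothesis.
text_frames = {"Commerce_buy", "Possession", "Arriving"}
hyp_frames = {"Commerce_buy", "Possession"}

score = overlap(text_frames, hyp_frames)
print(score)  # 1.0 -- every hypothesis frame is matched in the text
```

In the full system, several such measures (over predicates, frames, roles, and ontological classes) would be computed per text–hypothesis pair and passed as a feature vector to a machine learning model.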
In Section 2, we describe the linguistic resources and our system architecture. In Section 3, we present our approach for modeling similarity of text and hypothesis in a match graph. In Section 4, we report on our machine learning experiments, the results in the RTE task, and provide some error analysis, including discussion of typical examples that show the strengths and weaknesses of our approach. We conclude with a discussion of perspectives.

2 Base Components and Architecture

2.1 Basic Analysis Components

Our primary linguistic analysis components are the probabilistic LFG grammar for English developed at Parc (Riezler et al., 2002) and a combination of systems for frame semantic annotation: two probabilistic systems for frame and role annotation, Fred and Rosy (Erk and Pado, 2006), and a rule-based system for frame assignment, called Detour (to FrameNet) (Burchardt et al., 2005), which uses WordNet to address coverage problems in the current FrameNet data. In addition, we use the word sense disambiguation system of Banerjee and Pedersen (2003) and mappings from WordNet to SUMO (Niles and Pease, 2003) to assign WordNet synsets and SUMO ontological classes to main predicates.

2.2 Frame Semantics

Frame Semantics (Baker et al., 1998) models the lexical meaning of predicates and their argument