Use of Domain Knowledge
in Resolving Pronominal Anaphora
Laudy E.H.M. ter Haar
Universiteit Twente
Ivana Korbayová
Charles University,
Prague
Toine Andernach
Universiteit Twente
Paul E. van der Vet
Universiteit Twente
Abstract. The research reported here has been conducted in the context of the Plinius
project, which aims at semi-automatic knowledge acquisition from short natural-language
texts. In this framework, a system has been developed for finding the
antecedents of pronominal anaphora, in particular 'it'- and 'its'-anaphora. The anaphora
resolution module operates on parser output and can make use of information
generated by the parser; the lexicon gives the conceptual representations corresponding
to the words. The algorithm for anaphora resolution involves three steps: (i) Assemble:
construct a list of discourse entities (DEs); (ii) Identify: identify anaphoric DEs; (iii)
Select: select, for each anaphoric DE, another DE from the list of DEs as its antecedent.
The third step applies four constraints, i.e. rules to which a DE must conform in order to
be a valid candidate: (a) semantic type agreement; (b) number agreement; (c) projection
constraint; (d) conceptual compatibility. Constraints (a, b, c) are linguistic, while (d) is
domain-related. The algorithm has been tested on three texts. It turns out that applying
(d) before (a, b, c) considerably improves efficiency.
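
The three steps summarised in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Plinius implementation: the DE fields, the toy concept labels, the restriction to earlier DEs as candidates, and the lookup-table stand-in for conceptual compatibility are all hypothetical, and only constraints (b) and (d) are stubbed. Constraints are passed as an ordered list, so the domain-related check can be placed before or after the linguistic ones.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class DE:
    """Discourse entity. All fields are illustrative assumptions."""
    text: str
    position: int   # order of mention in the text
    number: str     # "sg" or "pl"
    concept: str    # hypothetical concept label taken from a lexicon


# A constraint receives (anaphor, candidate) and says whether the candidate survives.
Constraint = Callable[[DE, DE], bool]


def number_agreement(anaphor: DE, candidate: DE) -> bool:
    """Toy version of constraint (b)."""
    return anaphor.number == candidate.number


def make_conceptual_compatibility(required: Dict[int, str]) -> Constraint:
    """Toy stand-in for constraint (d). `required` maps an anaphor's position
    to the concept its context demands (hypothetical domain knowledge)."""
    def check(anaphor: DE, candidate: DE) -> bool:
        need = required.get(anaphor.position)
        return need is None or candidate.concept == need
    return check


def identify(des: Sequence[DE]) -> List[DE]:
    """Step (ii). Simplification: every 'it'/'its' is treated as anaphoric."""
    return [de for de in des if de.text.lower() in {"it", "its"}]


def select(anaphor: DE, des: Sequence[DE],
           constraints: Sequence[Constraint]) -> List[DE]:
    """Step (iii). Assumes candidates are the DEs mentioned before the anaphor;
    applies the constraints in the given order and returns the survivors."""
    candidates = [de for de in des if de.position < anaphor.position]
    for constraint in constraints:
        candidates = [c for c in candidates if constraint(anaphor, c)]
    return candidates


# Step (i), Assemble: a hand-built DE list standing in for parser output.
des = [
    DE("the powder", 0, "sg", "material"),
    DE("the furnace", 1, "sg", "device"),
    DE("the samples", 2, "pl", "material"),
    DE("its", 3, "sg", "unknown"),
]
# Hypothetical domain knowledge: the context of "its" calls for a material.
compat = make_conceptual_compatibility({3: "material"})

anaphors = identify(des)
survivors = select(anaphors[0], des, [compat, number_agreement])
print([c.text for c in survivors])
```

In this sketch the order of the constraint list does not change which candidates survive, only how many candidates each later check has to examine. How a single antecedent is chosen when several candidates remain is left open here.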
1. Introduction
Finding the antecedent of an anaphoric expression is generally held to require both
linguistic and more content-related clues. Making such clues explicit and using
them for effective and efficient resolution of anaphors is a problem for natural-
language processing systems. In the literature, attention has been directed
mainly at linguistic (syntactic and semantic) clues. Content-related clues are
often mentioned as being indispensable, but there is little practical experience
with them. From an engineering perspective, content-related clues appear
unattractive. Their implementation costs extra effort while, in the absence of
practical experience, their effectiveness is uncertain.
Belgian Journal of Linguistics 10 (1996), 11–35. DOI 10.1075/bjl.10.03haa
ISSN 0774–5141 / E-ISSN 1569-9676 © John Benjamins Publishing Company