Using NLP techniques to create legal ontologies in a logic programming based web information retrieval system

José Saias and Paulo Quaresma
Departamento de Informática, Universidade de Évora, 7000 Évora, Portugal
jsaias|pq@di.uevora.pt

ABSTRACT

Web legal information retrieval systems need the capability to reason with the knowledge modelled by legal ontologies. Using this knowledge it is possible to represent and to make inferences about the semantic content of legal documents.

In this paper a methodology for applying NLP techniques to automatically create a legal ontology is proposed. The ontology is defined in the OWL semantic web language and it is used in a logic programming framework, EVOLP+ISCO, to allow users to query the semantic content of the documents. ISCO allows an easy and efficient integration of declarative, object-oriented and constraint-based programming techniques with the capability to create connections with external databases. EVOLP is a dynamic logic programming framework allowing the definition of rules for actions and events.

An application of the proposed methodology to the legal web information retrieval system of the Portuguese Attorney General's Office is described.

1. INTRODUCTION

Modern web legal information retrieval systems need the capability to represent and to reason with the knowledge modelled by legal ontologies. In fact, ontologies allow the definition of class hierarchies, object properties, and relation rules, such as transitivity or functionality. Using this knowledge it is possible to represent semantic objects, to associate them with legal documents, and to make inferences about them.

OWL (Web Ontology Language) is a language proposed by the W3C (http://www.w3.org) to be used in the "semantic web" environment for the representation of ontologies.
This language is based on the earlier DAML+OIL (DARPA Agent Markup Language - [13]) language and it is defined using RDF (Resource Description Framework - [8]).

In this paper a methodology to automatically create an OWL ontology from a set of legal documents is proposed. The methodology is based on natural language processing techniques, namely, a syntactical parser and a semantic analyzer able to obtain a partial interpretation of the documents. A preliminary version of this work, aiming to create a DAML+OIL ontology, was presented in [12].

This task has similarities with the work of Boer et al. [5] in the context of the IST programme projects E-POWER and E-COURT. However, we do not intend to propose any kind of standard for legal ontologies; our aim is to define a methodology to automatically create a base ontology from a specific set of legal documents.

After the creation of the OWL legal ontology, documents are enriched with instances of legal classes and a logic programming based framework is used to support inferences over them.

The logic programming framework is based on ISCO [1] and EVOLP [2]. ISCO is a new declarative language implemented over GNU Prolog, with object-oriented predicates and constraints, that allows simple connections with external databases. EVOLP is a dynamic logic programming language that is able to describe actions and events, allowing the system to make inferences about events, user intentions and beliefs, and to have cooperative interactions.

Section 2 describes the natural language processing techniques used to create the OWL ontology. Section 3 describes ISCO, the basic logic programming framework. Section 4 describes EVOLP, the dynamic logic programming framework defined over ISCO and Prolog. Section 5 describes the interaction manager and section 6 provides a simple example. Finally, section 7 presents some conclusions and points out future work.

2. OWL ONTOLOGY CREATION

The OWL ontology is created from the output of natural language processing tools:

• Text syntactical parsing. The documents are analysed by the parser developed by E. Bick within the VISL project (http://visl.hum.sdu.dk/visl [4]). This parser is available for 21 different languages, namely

ICAIL 2003 Workshop on Legal Ontologies & Web Based Legal Information Management