ItalWordNet: extending and exploiting an existing resource for computational tasks

Ornella CORAZZARI
Consorzio Pisa Ricerche
Via S. Maria 40
Pisa 56100 - ITALY
corazzar@ilc.pi.cnr.it

Antonietta ALONGE
Istituto di Linguistica
Università di Perugia
Perugia 06100 - ITALY
antoalonge@libero.it

Francesca BERTAGNA
Consorzio Pisa Ricerche
Via S. Maria 40
Pisa 56100 - ITALY
F.Bertagna@ilc.pi.cnr.it

Nicoletta CALZOLARI
Istituto di Linguistica Computazionale, CNR
Area della Ricerca di Pisa
Via Alfieri 1, Loc. S. Cataldo
Ghezzano 56010 (PI) - ITALY
glottolo@ilc.pi.cnr.it

Adriana ROVENTINI
Istituto di Linguistica Computazionale, CNR
Area della Ricerca di Pisa
Via Alfieri 1, Loc. S. Cataldo
Ghezzano 56010 (PI) - ITALY
adriana@ilc.pi.cnr.it

Abstract

In this paper we discuss how the ItalWordNet semantic database, which is being built by extending the Italian wordnet developed within the EuroWordNet project, is being exploited for the lexical-semantic annotation of a corpus of Italian.

Introduction

Within the framework of a National Italian Project¹, we are building a lexical-semantic database, ItalWordNet (IWN), by extending the Italian wordnet developed within the EuroWordNet (EWN) project (Vossen, 1999). In this paper we first briefly provide basic information on IWN. We then discuss its usefulness and its limits for the lexical-semantic annotation of a subset of a corpus composed of fragments taken from Italian newspapers and magazines.

¹ SI-TAL ('Integrated System for the Automatic Treatment of Language') is a National Project, coordinated by Antonio Zampolli at the 'Consorzio Pisa Ricerche', which aims to develop large linguistic resources and software tools for processing written and spoken Italian.

1 Main characteristics of IWN

The IWN database consists of:

i) a generic wordnet containing about 64,000 word senses corresponding to about 49,000 synsets;
ii) a (generic) Interlingual-Index (ILI), which is an unstructured version of WordNet 1.5, also used in EWN to link wordnets of different languages;
iii) a terminological wordnet, containing about 5,000 synsets of the economic-financial domain²;
iv) a terminological ILI, to which the terminological wordnet is linked;
v) the Top Ontology (TO), a hierarchy of language-independent concepts, built within EWN and partially modified in IWN to account for adjectives (Alonge et al., 2000). Via the ILIs, all the concepts in the generic and specific wordnets are directly or indirectly linked to the TO;
vi) the Domain Ontology (DO), containing a set of domain labels. Via the ILIs, all the concepts in the generic and specific wordnets are directly or indirectly linked to the DO.

² Developed at IRST (Istituto per la Ricerca Scientifica e Tecnologica), Trento, Italy.

Although we have basically used the EWN model of lexical-semantic relations (see Alonge et