TOK: A meta-model and ontology for heterogenous terminological, linguistic and ontological knowledge resources

Nizar Ghoula, Gilles Falquet and Jacques Guyot
ISI lab, Centre Universitaire d’Informatique
Information System Department, University of Geneva
Geneva, Switzerland
{Nizar.Ghoula, Gilles.Falquet, Jacques.Guyot}@unige.ch

Abstract—Documents are rich resources that contain knowledge about a specific domain. Processing them is therefore a common task, and it relies on terminological and ontological resources. Various types of ontologies, thesauri, and many other resources are commonly used in knowledge extraction. Modeling and reusing these resources is intended to support knowledge management. In this paper, we propose a methodology and a model for ontological and terminological resource management. Our aim is to build a resource repository that offers operations for loading, storing, indexing, translating, generating and matching different resources. In this contribution we propose an ontology as a model of these resources and we explain how we can represent, annotate and load new resources into our repository.

Keywords-Ontology of Resources; Multilingual; Terminology; Alignment; Resources Repository.

I. INTRODUCTION

Many tasks related to documents, such as indexing, retrieval, annotation, or translation, are based on linguistic, terminological and ontological knowledge. This knowledge currently exists in resources of different types such as terminologies, glossaries, ontologies, multilingual dictionaries or parallel text corpuses. These resources are represented using various formalisms and languages (predicate logic, description logic, semantic networks, conceptual graphs, documents, etc.). Producing or finding linguistic, terminological and ontological knowledge resources is not a simple task; it is generally difficult to find the right resources for the right usage.
Some ontology repositories have been created to offer more effective indexing of these resources than common search engines. For example, Swoogle (http://swoogle.umbc.edu) indexes approximately 10 000 ontologies; the DAML site (http://www.daml.org/ontologies) provides search based on ontology components (classes, properties, ...) or metadata (URI, funding source, ...); BioPortal (http://bioportal.bioontology.org) has similar searching and browsing tools [1].

However, users need more than ontologies to perform knowledge engineering tasks, so it is important to have repositories offering access to more diverse resources in different formats. Moreover, available resources generally do not fit the user's needs exactly. Thus the user must be provided with tools to derive new resources from existing ones. This derivation may involve operations such as selecting a part of a resource, composing it with another one, translating it to another language or representing it in a different formalism.

In this contribution, we present our approach called TOK (Terminological, Linguistic and Ontological Knowledge Resource Management). TOK is based on the principles of the semantic web, metadata and ontologies to facilitate the representation, storage and alignment of heterogeneous and multilingual resources. In the first part of this paper we will identify the kinds of heterogeneous resources that we may process and discuss some of the proposed resource models. Afterwards we will describe our approach to the resource representation structure, which is composed of different levels. In the third part we will introduce our ontology of resources, which is an implementation of our general model of heterogeneous resources.
The final section describes the storage of these resources in the repository and introduces the next part of our work, which will focus on identifying and defining the operations that can be used to process the stored resources.

II. KNOWLEDGE RESOURCES

A central point of our approach is to build a repository of knowledge resources. This repository is a collection of heterogeneous resources that are based on different formalisms or models. In this paper we will focus on the ontology that describes this repository and on the way we manage the resource representation by means of a resource representation ontology.

A. Resources Identification

In the TOK resource model we distinguish two main categories of resources: autonomous and enrichment resources.

Autonomous resources: these are resources whose existence is independent of any other resource, like ontologies, thesauri, terminologies, corpuses, documents, etc.

2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology
978-0-7695-4191-4/10 $26.00 © 2010 IEEE
DOI 10.1109/WI-IAT.2010.204