TOK: A meta-model and ontology for heterogeneous
terminological, linguistic and ontological knowledge resources
Nizar Ghoula, Gilles Falquet and Jacques Guyot
ISI lab, Centre Universitaire d’Informatique
Information System Department, University of Geneva
Geneva, Switzerland
{Nizar.Ghoula, Gilles.Falquet, Jacques.Guyot}@unige.ch
Abstract—Documents are rich resources containing knowledge
that describes a specific domain. Processing them is therefore
a common task, and it relies on terminological
and ontological resources. Various types of ontologies, thesauri,
and many other resources are commonly used in the
process of knowledge extraction. Modeling and reusing
these resources is intended to support knowledge management.
In this paper, we propose a methodology and a model for
ontological and terminological resource management. Our aim
is to build a resources repository that offers operations for
loading, storing, indexing, translating, generating and matching
different resources. In this contribution we propose an ontology
as a model of these resources and we explain how we can
represent, annotate and load new resources into our repository.
Keywords-Ontology of Resources; Multilingual; Terminology;
Alignment; Resources Repository.
I. INTRODUCTION
Many tasks related to documents, such as indexing, retrieval,
annotation, or translation, are based on linguistic,
terminological and ontological knowledge. This knowledge
currently exists in resources of different types, such as
terminologies, glossaries, ontologies, multilingual dictionaries
or parallel text corpora. These resources are represented
using various formalisms and languages (predicate logic,
description logic, semantic networks, conceptual graphs,
documents, etc.). Producing or finding linguistic, terminological
and ontological knowledge resources is not a simple
task: it is generally difficult to find the right resources for the
right usage. Some ontology repositories have been created
to offer more effective indexing of these resources than
common search engines. For example, Swoogle¹ indexes
approximately 10 000 ontologies; the DAML site² provides
search based on ontology components (classes, properties,
...) or metadata (URI, funding source, ...); BioPortal³ has
similar searching and browsing tools [1].
¹ http://swoogle.umbc.edu
² http://www.daml.org/ontologies
³ http://bioportal.bioontology.org
However, users need more than ontologies to perform
knowledge engineering tasks, so it is important to have
repositories offering access to more diverse resources in
different formats. Moreover, available resources generally
do not exactly fit the user's needs. Thus, the user
must be provided with tools to derive new resources from
existing ones. This derivation may involve operations such
as selecting a part of a resource, composing it with another
one, translating it to another language or representing it in
a different formalism.
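The derivation operations listed above can be illustrated with a minimal sketch. It is not the TOK implementation: the Resource class, its fields, and the functions select, compose and translate are hypothetical names introduced here for illustration only, and a resource is reduced to a flat set of terms. Conversion to a different formalism is left out because it depends on the formalisms involved.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """Hypothetical stand-in for a knowledge resource: a named set of terms."""
    name: str
    language: str
    formalism: str
    entries: frozenset = field(default_factory=frozenset)


def select(resource, predicate):
    """Derive a new resource keeping only the entries accepted by `predicate`."""
    kept = frozenset(e for e in resource.entries if predicate(e))
    return Resource(f"{resource.name}-selection", resource.language,
                    resource.formalism, kept)


def compose(first, second):
    """Derive a new resource holding the entries of both inputs.

    Assumption of this sketch: only resources that share a language and a
    formalism can be composed directly.
    """
    if (first.language, first.formalism) != (second.language, second.formalism):
        raise ValueError("resources must share language and formalism")
    return Resource(f"{first.name}+{second.name}", first.language,
                    first.formalism, first.entries | second.entries)


def translate(resource, lexicon, target_language):
    """Derive a new resource by mapping entries through a bilingual lexicon.

    Assumption of this sketch: entries missing from `lexicon` are dropped.
    """
    translated = frozenset(lexicon[e] for e in resource.entries if e in lexicon)
    return Resource(f"{resource.name}-{target_language}", target_language,
                    resource.formalism, translated)
```

Each function returns a new Resource and leaves its inputs unchanged, which mirrors the idea that derived resources are added alongside the existing ones rather than replacing them.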
In this contribution, we present our approach called TOK
(Terminological, Linguistic and Ontological Knowledge Re-
source Management). TOK is based on the principles of
the semantic web, metadata and ontologies to facilitate the
representation, storage and alignment of heterogeneous and
multilingual resources. In the first part of this paper, we
identify the kinds of heterogeneous resources that we may
process and discuss some of the proposed resource models.
We then describe our resource representation structure,
which is composed of different
levels. In the third part, we introduce our ontology of
resources, which is an implementation of our general model
of heterogeneous resources. The final section describes the
storage of these resources in the repository and introduces
the next part of our work, which will focus on
identifying and defining the operations that can be
used to process the stored resources.
II. KNOWLEDGE RESOURCES
A central point of our approach is to build a repository of
knowledge resources. This repository is a collection of heterogeneous
resources that are based on different formalisms
or models. In this paper we focus on the ontology that
describes this repository and on the way we manage
resource representation by means of a resource representation
ontology.
A. Resources Identification
In the TOK resource model we distinguish two main categories
of resources: autonomous and enrichment resources.
Autonomous resources: these are resources whose existence
is independent of any other resource, such as ontologies,
thesauri, terminologies, corpora, documents, etc.
2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology
978-0-7695-4191-4/10 $26.00 © 2010 IEEE
DOI 10.1109/WI-IAT.2010.204