798 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 41, NO. 4, JULY 2011
Ontology Extraction for Knowledge Reuse:
The e-Learning Perspective
Matteo Gaeta, Member, IEEE, Francesco Orciuoli, Stefano Paolozzi, and Saverio Salerno
Abstract—Ontologies have been frequently employed in order
to solve problems derived from the management of shared dis-
tributed knowledge and the efficient integration of information
across different applications. However, the process of ontology
building is still a lengthy and error-prone task. Therefore, a number
of research studies have developed methods to (semi-)automatically
build ontologies from existing documents. In this paper, we
present our approach to extracting relevant ontology concepts and
their relationships from a knowledge base of heterogeneous text
documents. We also show the architecture of the implemented
system and discuss the experiments in a real-world context.
Index Terms—E-learning, knowledge acquisition, ontology ex-
traction, ontology learning.
I. INTRODUCTION
THE INFORMATION and communication technology
community widely acknowledges the importance and use-
fulness of domain ontologies, particularly in relation to Se-
mantic Web applications [4]. However, the promises of the
Semantic Web are still far from being fully implemented. In
this scenario, a critical issue is ontology building, which includes
identifying, defining, and entering concept definitions and their
relationships. Indeed, in large complex application domains,
this task can be lengthy, costly, and controversial, particularly
because people can have different points of view about the
same concept. Therefore, finding (semi-)automatic algorithms
to extract ontology concepts from existing knowledge bases
represents an important activity. However, most approaches
have considered only one step in the overall ontology engineering
process, for example, the acquisition of concepts,
the establishment of a concept taxonomy, or the discovery
of conceptual relationships, whereas one must consider the
overall process when building real-world applications. For this
purpose, efforts have been made to facilitate the ontology
engineering process, particularly the acquisition of ontologies
from domain texts.
In this paper, we describe our approach for ontology extraction
from an existing knowledge base of heterogeneous
documents. We also show an implementation of the proposed
approach in the context of e-learning and present the experimental
evaluation.
Manuscript received December 5, 2008; accepted March 25, 2009. Date of
publication May 12, 2011; date of current version June 21, 2011. This paper
was recommended by Editor W. Pedrycz.
M. Gaeta, F. Orciuoli, and S. Salerno are with the Centro di Ricerca in
Matematica Pura ed Applicata, Dipartimento di Ingegneria dell’Informazione
e Matematica Applicata, University of Salerno, 84084 Fisciano, Italy (e-mail:
gaeta@diima.unisa.it; orciuli@diima.unisa.it; salerno@unisa.it).
S. Paolozzi is with the Dipartimento di Informatica e Automazione, Roma
Tre University, 00146 Rome, Italy (e-mail: stefano.paolozzi@gmail.com).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSMCA.2011.2132713
A. Background and Related Works
In the literature, the area of study addressed by this paper
is called ontology extraction or ontology learning. These terms
denote the process of extracting ontological representations
from large amounts of unstructured text. Compared
with the more general task of information extraction, ontology
learning focuses on concepts and relationships between concepts.
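To make the target of ontology learning concrete, the following is a minimal sketch of a structure holding concepts and the relationships between them. The class name `Ontology`, its methods, and the example concept and relation names are hypothetical and chosen only for illustration; they are not taken from any of the systems discussed in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Hypothetical container: a set of concepts plus labeled relations."""
    concepts: set = field(default_factory=set)
    # Each relation is a (subject, predicate, object) triple.
    relations: set = field(default_factory=set)

    def add_relation(self, subject, predicate, obj):
        """Register both concepts and the relation that links them."""
        self.concepts.update((subject, obj))
        self.relations.add((subject, predicate, obj))

    def related(self, subject, predicate):
        """Return every concept linked to `subject` through `predicate`."""
        return {o for s, p, o in self.relations
                if s == subject and p == predicate}


# Illustrative e-learning concepts (assumed names, not from the paper).
onto = Ontology()
onto.add_relation("lesson", "is_a", "learning_object")
onto.add_relation("lesson", "has_topic", "ontology")
```

Under this representation, an extraction system would populate `concepts` and `relations` from text, whereas a general information extraction system would typically fill in instance-level facts.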
Two main approaches have been developed to aid ontology
learning. The first one facilitates manual ontology engineer-
ing by providing natural language processing tools, including
editors, consistency checkers, mediators to support shared de-
cisions, and ontology import tools. The second approach is
based on machine learning and automated language processing
techniques to extract concepts and ontological relations from
structured and unstructured data such as databases and texts.
Few systems exploit both approaches. The first approach is
predominant in most developed tools such as KAON [36],
Protégé [1], Chimaera [22], and many others, but some systems
also implement machine learning techniques.
In recent years, there has been an increasing awareness of
the potential value of ontologies, accompanied by a growing
realization of the effort required to develop them manually. As
a consequence, many research studies focus
on the development of techniques through which ontological
knowledge might be extracted from existing sources.
A number of systems have been proposed for ontology ex-
traction from text. We describe some of them in the following.
ASIUM [12] extracts verb frames and taxonomic knowledge,
based on statistical analysis of syntactic parsing of texts.
Text-To-Onto [18] combines machine learning approaches
with basic linguistic processing such as tokenization, lemmatization,
and shallow parsing. It is based on the General Architecture
for Text Engineering (GATE) framework [8]. The
Text-To-Onto system defines a common framework within
which extraction and maintenance mechanisms can be easily
managed.
OntoLearn [25], [35] is partially supported by the INTEROP
Network of Excellence. The main task performed by OntoLearn
is semantic disambiguation, which is performed
using a method called structural semantic interconnection,
an approach to pattern recognition that uses graphs to
describe the objects to analyze (word senses) and a context-free
grammar to detect common semantic patterns between graphs.
OntoLT [29] extracts ontology concepts by term extraction
through statistical methods and definition of linguistic patterns
as well as convenient mappings to ontological structures.
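The general idea of combining statistical term extraction with linguistic patterns can be sketched as follows. This is not OntoLT's implementation: the frequency threshold, the stop-word list, and the single "X such as Y" pattern are assumptions introduced here purely to illustrate the two steps, and all function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "and", "or", "such", "as",
              "is", "are", "in", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def candidate_terms(documents, min_count=2):
    """Statistical step: keep non-stop-word tokens seen at least
    `min_count` times across the whole document collection."""
    counts = Counter(
        token
        for doc in documents
        for token in tokenize(doc)
        if token not in STOP_WORDS
    )
    return {term: n for term, n in counts.items() if n >= min_count}


# Pattern step (assumed pattern): "<hypernym> such as <hyponym>"
# is read as evidence of an is-a relation.
SUCH_AS = re.compile(r"\b([a-z]+) such as ([a-z]+)")


def is_a_relations(documents):
    """Return (hyponym, hypernym) pairs matched by the pattern."""
    pairs = set()
    for doc in documents:
        for hypernym, hyponym in SUCH_AS.findall(doc.lower()):
            pairs.add((hyponym, hypernym))
    return pairs


docs = [
    "Tools such as editors support ontology engineering.",
    "Ontology editors and ontology checkers are tools.",
]
terms = candidate_terms(docs)      # frequent candidate concepts
relations = is_a_relations(docs)   # candidate taxonomic links
```

A real system would replace raw frequency with a domain-relevance measure and would use parsed linguistic annotations instead of a regular expression; the sketch only shows how the two kinds of evidence feed candidate concepts and candidate relations.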
1083-4427/$26.00 © 2011 IEEE