798 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 41, NO. 4, JULY 2011

Ontology Extraction for Knowledge Reuse: The e-Learning Perspective

Matteo Gaeta, Member, IEEE, Francesco Orciuoli, Stefano Paolozzi, and Saverio Salerno

Abstract—Ontologies have been frequently employed in order to solve problems derived from the management of shared distributed knowledge and the efficient integration of information across different applications. However, the process of ontology building is still a lengthy and error-prone task. Therefore, a number of research studies to (semi-)automatically build ontologies from existing documents have been developed. In this paper, we present our approach to extract relevant ontology concepts and their relationships from a knowledge base of heterogeneous text documents. We also show the architecture of the implemented system and discuss the experiments in a real-world context.

Index Terms—E-learning, knowledge acquisition, ontology extraction, ontology learning.

I. INTRODUCTION

THE INFORMATION and communication technology community widely acknowledges the importance and usefulness of domain ontologies, particularly in relation to Semantic Web applications [4]. However, the promises of the Semantic Web are still far from being fully implemented. In this scenario, a critical issue is ontology building that includes identifying, defining, and entering concept definitions and their relationships. Indeed, in large complex application domains, this task can be lengthy, costly, and controversial, particularly because people can have different points of view about the same concept. Therefore, finding (semi-)automatic algorithms to extract ontology concepts from existing knowledge bases represents an important activity.
However, most approaches have “only” considered one step in the overall ontology engineering process, for example, the acquisition of concepts, the establishment of a concept taxonomy, or the discovery of conceptual relationships, whereas one must consider the overall process when building real-world applications. For this purpose, efforts have been made to facilitate the ontology engineering process, particularly the acquisition of ontologies from domain texts.

In this paper, we describe our approach for ontology extraction from an existing knowledge base of heterogeneous documents. We also show an implementation of the proposed approach in the context of e-learning and present the experimental evaluation.

Manuscript received December 5, 2008; accepted March 25, 2009. Date of publication May 12, 2011; date of current version June 21, 2011. This paper was recommended by Editor W. Pedrycz.
M. Gaeta, F. Orciuoli, and S. Salerno are with the Centro di Ricerca in Matematica Pura ed Applicata, Dipartimento di Ingegneria dell’Informazione e Matematica Applicata, University of Salerno, 84084 Fisciano, Italy (e-mail: gaeta@diima.unisa.it; orciuli@diima.unisa.it; salerno@unisa.it).
S. Paolozzi is with the Dipartimento di Informatica e Automazione, Roma Tre University, 00146 Rome, Italy (e-mail: stefano.paolozzi@gmail.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSMCA.2011.2132713

A. Background and Related Works

In the literature, the area of study addressed by this paper is called ontology extraction or ontology learning. Both terms denote the process of extracting ontological representations from large amounts of unstructured text. Compared to the more general information extraction, ontology learning focuses on concepts and relationships between concepts.

Two main approaches have been developed to aid ontology learning.
The first one facilitates manual ontology engineering by providing natural language processing tools, including editors, consistency checkers, mediators to support shared decisions, and ontology import tools. The second approach relies on machine learning and automated language processing techniques to extract concepts and ontological relations from structured and unstructured data such as databases and texts. Few systems exploit both approaches. The first approach is predominant in most developed tools such as KAON [36], Protégé [1], Chimaera [22], and many others, but some systems also implement machine learning techniques.

In recent years, there has been an increasing awareness of the potential value of ontologies, accompanied by a growing realization of the effort required to develop them manually. As a consequence, many research studies focus on the development of techniques through which ontological knowledge might be extracted from existing sources.

A number of systems have been proposed for ontology extraction from text. We describe some of them in the following.

ASIUM [12] extracts verb frames and taxonomic knowledge, based on statistical analysis of syntactic parsing of texts.

Text-To-Onto [18] combines machine learning approaches with basic linguistic processing such as tokenization or lemmatization and shallow parsing. It is based on the General Architecture for Text Engineering (GATE) framework [8]. The Text-To-Onto system defines a common framework within which extraction and maintenance mechanisms may be easily managed.

OntoLearn [25], [35] is partially supported by the INTEROP Network of Excellence. The main task performed by OntoLearn is semantic disambiguation.
Semantic disambiguation is performed using a method called structural semantic interconnection, an approach to pattern recognition that uses graphs to describe the objects to analyze (word senses) and a context-free grammar to detect common semantic patterns between graphs.

OntoLT [29] extracts ontology concepts by term extraction through statistical methods and definition of linguistic patterns as well as convenient mappings to ontological structures.

1083-4427/$26.00 © 2011 IEEE
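As a rough illustration of the kind of processing described for the systems above, the following minimal Python sketch pairs a frequency count over tokens (a stand-in for statistical term extraction) with one hand-written linguistic pattern. It is not the implementation of OntoLT or of any other cited system. The stopword list, the "such as" pattern, and the function names `candidate_terms` and `such_as_relations` are all assumptions introduced for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "are", "as", "by", "from", "in", "is",
             "of", "or", "such", "the", "to"}

# Illustrative pattern: "<general> such as <specific>, <specific> and <specific>".
# Only single-word terms are captured, which is a simplification.
SUCH_AS = re.compile(
    r"\b(\w+) such as (\w+(?:, (?!and |or )\w+)*(?:,? (?:and|or) \w+)?)"
)
# Separators inside the captured list: ", ", " and ", " or ", ", and ", ", or ".
LIST_SEPARATOR = re.compile(r",?\s+(?:and|or)\s+|,\s*")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def candidate_terms(text, min_count=2):
    """Return non-stopword tokens that occur at least min_count times."""
    counts = Counter(t for t in tokenize(text) if t not in STOPWORDS)
    return {term: n for term, n in counts.items() if n >= min_count}


def such_as_relations(text):
    """Return (specific, general) pairs proposed by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text.lower()):
        general = match.group(1)
        for specific in LIST_SEPARATOR.split(match.group(2)):
            pairs.append((specific, general))
    return pairs


if __name__ == "__main__":
    sample = (
        "Ontology learning extracts concepts from data such as databases "
        "and texts. Ontology tools help engineers maintain an ontology."
    )
    print(candidate_terms(sample))
    print(such_as_relations(sample))
```

A real system would replace the raw frequency threshold with a proper statistical measure and would rely on linguistic preprocessing (tokenization, lemmatization, shallow parsing) rather than regular expressions. The sketch only shows how candidate concepts and candidate taxonomic relations can come from two separate, simple passes over the same text.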