International Journal of Modern Engineering Research (IJMER), www.ijmer.com, Vol. 3, Issue 2, March-April 2013, pp. 985-989, ISSN: 2249-6645

Kirankumar Kataraki (1), Sumana M (2)
(1) IV sem M.Tech, Department of Information Science & Engg, M S Ramaiah Institute of Technology, Bangalore-54
(2) Assistant Professor, Department of Information Science & Engg, M S Ramaiah Institute of Technology, Bangalore-54

Abstract: Ontology extraction plays an important role in the Semantic Web as well as in knowledge management. The emergence of the Semantic Web and its related technologies promises to make the Web a meaningful experience. In turn, the success of the Semantic Web and its applications depends largely on the utilization and interoperability of well-formulated ontology bases in an automated, heterogeneous environment. An ontology describes what exists in a domain and how those things relate to each other. The advantage of an ontology is that it represents real-world information in a manner that is machine understandable. This leads to a variety of interesting applications for the benefit of the target user groups. An ontology defines the terms used to describe and represent an area of knowledge. Ontologies are critical for applications that need to search across or merge information from diverse communities. In this paper, we present our approach to extracting relevant ontology concepts and their relationships from a knowledge base of heterogeneous text documents.

Keywords: heterogeneous, knowledge, machine understandable, Ontology Extraction, Semantic Web

I. Introduction

The Semantic Web is a major research initiative of the World Wide Web Consortium (W3C) [1] to create a metadata-rich Web of resources that can describe themselves not only by how they should be displayed (HTML) or by their syntax (XML), but also by the meaning of the metadata. We consider the Semantic Web to be the next-generation Web, one that provides great benefits in Web Services, Internet Commerce, and other promising application areas.
However, the Semantic Web is still at an early stage: it is not fully implemented and has many unsolved problems. One of the major problems is to extract data from heterogeneous documents in a form that a machine can understand, which we call ontology extraction. The basic approach to ontology extraction is manual, but most current research focuses on methods that generate ontologies automatically or semi-automatically. Manual ontology building is a time-consuming activity that requires considerable effort for domain knowledge acquisition and domain knowledge modeling. To overcome these problems, many methods, systems and tools have been developed that use text mining and machine learning techniques to generate ontologies automatically or semi-automatically. The research field that studies these issues is usually called “ontology generation”, “ontology extraction” or “ontology learning”. However, most approaches have considered “only” one step in the overall ontology engineering process [2], for example generating concepts and relationships [3] or extracting concepts and relationships, whereas one must consider the overall process when building real-world applications.

In this paper, we describe our approach to ontology extraction from an existing knowledge base of heterogeneous documents. Information extraction from heterogeneous text is needed because it gives direct access to knowledge held in textual form, lets people access only the relevant information, and supports knowledge sharing.

A. Background and Related Works

Two main approaches have been developed in ontology extraction. The first facilitates manual ontology engineering by providing natural language processing tools and ontology import tools. The second is based on machine learning and automated language processing techniques that extract concepts and ontological relations from structured and unstructured data such as databases and texts.
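To make the second approach more concrete, the following is a minimal Python sketch. It is not the method of this paper or of any system cited here. It assumes that a candidate concept is simply a term occurring at least `min_count` times across the documents, and that a candidate relation is a pair of such concepts co-occurring in one sentence. The function names, the tiny stop-word list and the threshold are all illustrative assumptions.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "and", "to",
              "that", "it", "as", "for", "with", "by", "on"}


def split_sentences(text):
    """Split text on sentence-ending punctuation and drop empty pieces."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def tokenize(sentence):
    """Lower-case a sentence and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", sentence.lower())
            if t not in STOP_WORDS]


def extract_candidates(documents, min_count=2):
    """Return (concepts, relations).

    concepts  : set of terms seen at least `min_count` times overall.
    relations : Counter mapping an alphabetically ordered concept pair to the
                number of sentences in which both concepts occur.
    """
    term_counts = Counter()
    sentences = []
    for doc in documents:
        for sentence in split_sentences(doc):
            tokens = tokenize(sentence)
            sentences.append(tokens)
            term_counts.update(tokens)

    concepts = {term for term, count in term_counts.items() if count >= min_count}

    relations = Counter()
    for tokens in sentences:
        present = sorted(set(tokens) & concepts)
        relations.update(combinations(present, 2))
    return concepts, relations


if __name__ == "__main__":
    docs = [
        "An ontology defines concepts. An ontology relates concepts.",
        "The web uses an ontology.",
    ]
    concepts, relations = extract_candidates(docs)
    print(sorted(concepts))   # ['concepts', 'ontology']
    print(dict(relations))    # {('concepts', 'ontology'): 2}
```

Frequency and co-occurrence are used here only because they are the simplest statistics that yield both concepts and relations; the systems described in the literature rely on richer syntactic and statistical analysis.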
A number of systems have been proposed for ontology extraction from text. We describe some of them below.

ASIUM [4] extracts verb frames and taxonomic knowledge, based on statistical analysis of syntactically parsed texts.

Text-To-Onto [5] is an open-source ontology management infrastructure with a tool suite for building ontologies from an initial core ontology. It combines knowledge acquisition and machine learning techniques to discover conceptual structures.

Information Retrieval [6] is a domain-independent approach that creates clusters of the words appearing in the text, with the aim of building a hierarchy of concepts. Its learning method is based on a distributional approach: nouns playing the same syntactic role in sentences with the same verb are grouped together in the same class.

Effective ontology management in virtual learning environments [7] describes a semi-automatic, data-driven approach to topic ontologies that integrates machine learning and text mining algorithms. Its main features are automatic keyword extraction from the documents given as input to the system (the extracted keywords are “candidate concepts” of the ontology) and the generation of concept suggestions.

II. Approach for Ontology Extraction

Ontology is a basic building block of the Semantic Web [8]. An active line of Semantic Web research focuses on how to build and evolve ontologies using the information in the different sources inherent in a domain, such as txt, doc, ppt and pdf files. A large part of the IT industry uses software engineering methodologies to build software solutions that solve real-world problems. The ontology building process consists of the following phases.

Ontology Extraction from Heterogeneous Documents