DOI: 10.4018/IJKM.2020010104
International Journal of Knowledge Management
Volume 16 • Issue 1 • January-March 2020
Copyright © 2020, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.

A Hybrid Approach to Retrieve Knowledge from a Document

Deepak Sahoo, IIIT-Bhubaneswar, Bhubaneswar, India
Rakesh Chandra Balabantaray, IIIT Bhubaneswar, Bhubaneswar, India

ABSTRACT

Retrieving the theme of a document and presenting it to the user in a form shorter than the original text is a challenging task. In this article, a hybrid approach to extract knowledge from a text document is presented, in which three key sentence-level relationships are used in association with the Markov clustering algorithm to cluster the sentences in the document. After clustering, the sentences in each cluster are ranked, and the highest-ranked sentences in each cluster are merged. Finally, to obtain the theme of the document, the gradient boosting technique XGBoost is used to compress the newly generated sentence. The DUC-2002 data set is used to evaluate the proposed system, and it has been observed that the proposed system performs better than other existing systems.

KEYWORDS

Knowledge Retrieval, Rouge Score, Sentence Clustering, Sentence Compression, Sentence Merging, XGBoost

INTRODUCTION

Knowledge management (KM) is a method that originated in the business world for unifying the huge amounts of documents generated from meetings, proposals, presentations, analytic papers, and training materials (Bordoni et al., 2002). The documents created in an organization represent its potential knowledge: “potential” because only parts of this data and information will prove helpful to the organization's members in creating organizational knowledge.
In this view, one major challenge is the selection of relevant information from vast amounts of documents and the ability to make it available for use and re-use by organization members. The objective of “mainstream” knowledge management is to ensure that the right information is delivered to the right person at the right time, so that the most appropriate decision can be taken. In this sense, KM is aimed not at managing knowledge per se, but at relating knowledge to its usage. Along this line, we focus on the extraction of relevant information to be delivered to a decision maker. The knowledge pyramid has been used for several years to illustrate the hierarchical relationships between data, information, knowledge, and wisdom. The revised knowledge pyramid model proposed by Jennex (2013, 2017) includes knowledge management as extraction of reality with a focus on organizational learning. To this end, a range of Text Mining (TM) and Natural Language Processing (NLP) techniques can be used as an effective Knowledge Management System (KMS) supporting the extraction of relevant information from large amounts of unstructured textual data and, thus, the creation of knowledge (Bordoni et al., 2002). There has been an explosion in the amount of text data from a variety of sources. This volume of text is an invaluable source of information and knowledge which needs to be effectively summarized to