International Research Journal of Engineering and Technology (IRJET), e-ISSN: 2395-0056, p-ISSN: 2395-0072
Volume: 02 Issue: 02 | May-2015, www.irjet.net
© 2015, IRJET.NET - All Rights Reserved, Page 113

Test Model for Rich Semantic Graph Representation for Hindi Text using Abstractive Method

Manjula Subramaniam 1, Prof. Vipul Dalal 2
Computer Engineering, Vidyalankar Institute of Technology, Maharashtra, India

Abstract - In this paper we present a method for summarizing a Hindi text document by creating a rich semantic graph (RSG) of the original document and identifying substructures of the graph from which meaningful sentences can be extracted to generate a document summary. The paper contributes an approach to summarizing Hindi text documents using the abstractive method. We extract a set of features from each sentence that help identify its importance in the document. The method uses Hindi WordNet to identify the appropriate position of each word when checking SOV (Subject-Object-Verb) qualification. To optimize the summary, we find the similarity among the sentences and merge the sentences, which are represented using rich semantic sub-graphs; this in turn produces a summarized text document.

Key Words: Text Analysis, Text Summarization, Abstractive Summary, Rich Semantic Graph Representation.

1. INTRODUCTION
The data on the World Wide Web is growing at an exponential pace. Nowadays, people use the internet to find information through information retrieval (IR) tools such as Google, Yahoo, and Bing. However, with the exponential growth of information on the internet, abstraction or summarization of the retrieved results has become necessary for users. In the current era of information overload, text summarization has become an important and timely tool that lets users quickly understand large volumes of information.
The goal of summarizing a text document is therefore to condense the document while preserving its important content. A wide range of technologies now falls under Human Language Technology (HLT), including Natural Language Processing (NLP), Speech Recognition, Machine Translation, Text Generation and Text Mining. In this paper, we focus on two of these areas, NLP and Text Mining, which lead to text summarization. Text summarization is the process of extracting salient information from the source text and presenting that information to the user in the form of a summary. The need for text summarization has appeared in many areas, such as summaries of news articles and email, short news messages on mobile devices, information summaries for businesspeople, government officials and researchers, and summaries of the pages returned by online search engines [1]. Text summarization approaches are broadly classified into two types: extractive and abstractive. Extractive summarization identifies important sections of the text and reproduces them verbatim, while abstractive summarization aims to produce the important material in a new, generalized form [1].

In this paper, a novel approach is presented to automatically generate an abstractive summary of a Hindi input text document using a semantic graph reduction technique. This approach exploits a new semantic graph called the Rich Semantic Graph (RSG) [3, 7]. RSG is an ontology-based representation developed to be used as an intermediate representation for Natural Language Processing (NLP) applications. The new approach consists of three phases: creating a rich semantic graph for the source document, reducing the generated rich semantic graph to a more abstract graph, and finally generating the abstractive summary from the abstracted rich semantic graph.

2. BACKGROUND AND RELATED WORK
A text summary is a shorter version of the original document that still preserves the main content of the source document. There are various definitions of a text summary in the literature. According to [8], “The aim of automatic text summarization is to condense the source text by extracting its most important content that meets a user’s or application needs”. According to [9], “A summary is a text that is produced from one or more texts that contains a significant portion of the information text(s), and is no longer than half of the original text(s)”. There are various effective techniques for generating an extractive summary, which help to find relevant sentences to add to the summary. These can be classified as Statistical, Linguistic and Hybrid approaches.
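
To make the statistical class of extractive techniques concrete, the following is a minimal sketch and not the method proposed in this paper. It scores each sentence by the average document-wide frequency of its words and keeps the top-scoring sentences in their original order. The function names `split_sentences` and `summarize`, the whitespace tokenization, and the `max_ratio` parameter are assumptions made for illustration. The default of 0.5 loosely mirrors the "no longer than half" criterion quoted from [9], measured here in sentences rather than in text length.

```python
import re
from collections import Counter


def split_sentences(text):
    """Split on '.', '!', '?' or the Devanagari danda (U+0964); drop empty pieces."""
    return [s.strip() for s in re.split(r"[.!?\u0964]+", text) if s.strip()]


def summarize(text, max_ratio=0.5):
    """Return the highest-scoring sentences, at most max_ratio of the sentence count.

    Illustrative assumption: importance is approximated by average term frequency.
    """
    sentences = split_sentences(text)
    if not sentences:
        return []

    # Term frequency over the whole document (lowercased whitespace tokens).
    freq = Counter(w for s in sentences for w in s.lower().split())

    def score(sentence):
        words = sentence.lower().split()
        return sum(freq[w] for w in words) / len(words)

    # Keep at least one sentence, and at most max_ratio of the original count.
    keep = max(1, int(len(sentences) * max_ratio))

    # Rank by descending score; ties are broken by original position.
    ranked = sorted(range(len(sentences)), key=lambda i: (-score(sentences[i]), i))

    # Emit the selected sentences in document order.
    return [sentences[i] for i in sorted(ranked[:keep])]
```

Because the selected sentences are copied verbatim, the output is an extractive summary. An abstractive method such as the RSG-based approach would instead generate new text from an intermediate representation.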