DIVERSITY BASED TEXT SUMMARIZATION

Mohammed Salem Binwahlan 1, Naomie Salim 2, Ladda Suanmali 3

1,2 Faculty of Computer Science and Information Systems, Universiti Teknologi Malaysia, 81300 Skudai, Johor
3 Faculty of Science and Technology, Suan Dusit Rajabhat University, 295 Rajasrima Rd, Dusit, Bangkok, Thailand 10300

E-mail: 1 moham2007med@yahoo.com, 2 naomie@utm.my, 3 ladda_sua@dusit.ac.th

Abstract: Diversity of the selected sentences is an important factor in automatic text summarization because it controls redundancy in the summarized text. In this paper, we propose a method called maximal marginal importance (MMI) for text summarization. It is based on the idea of the well-known diversity approach maximal marginal relevance (MMR), with an emphasis on diversity. A diversity-based binary tree is used to exploit the diversity among the document sentences: the whole document is clustered into a number of clusters, and each cluster is then represented as one or more binary trees. In our method, each sentence is evaluated based on its importance and its relevance. Our experimental results show that the proposed method outperforms the three benchmark methods used in this study.

Keywords: summarization, diversity, binary tree, similarity threshold.

1. INTRODUCTION

Automatic text summarization has gained considerable importance as an active research field in recent years. The benefits that automatic summarization systems offer increase the need for such systems; the most important benefits of a summary are reduced reading time and a quick guide to the information of interest. Diversity, which refers to the distinct ideas contained in a document, has become a very important factor in automatic text summarization because it controls redundancy in the summarized text. Many diversity-based approaches to text summarization have been proposed. For example, MMR (maximal marginal relevance) [1] maximizes marginal relevance in retrieval and summarization.
The sentence with high maximal relevance means it

Jurnal Teknologi Maklumat, Vol. 20, No. 2 (December 2008)