Intelligent Control and Automation, 2010, 1, 105-111
doi:10.4236/ica.2010.12012 Published Online November 2010 (http://www.SciRP.org/journal/ica)
Copyright © 2010 SciRes. ICA

Multi-Document Summarization Model Based on Integer Linear Programming

Rasim Alguliev, Ramiz Aliguliyev, Makrufa Hajirahimova
Institute of Information Technology of National Academy of Sciences of Azerbaijan
E-mail: a.ramiz@science.az
Received August 28, 2010; revised October 1, 2010; accepted October 3, 2010

Abstract

This paper proposes an extractive generic text summarization model that generates summaries by selecting sentences according to their scores. Sentence scores are calculated from how extensively each sentence covers the main content of the text, and summaries are created by extracting the highest-scored sentences from the original document. The model is formalized as a multiobjective integer programming problem. An advantage of this model is that it can cover the main content of the source(s) while reducing redundancy in the generated summaries. To extract sentences that form a summary with extensive coverage of the main content and low redundancy, the model uses the similarity of sentences to the original document and the similarity between sentences. Performance is evaluated by comparing the summarization outputs with the manual summaries of the DUC2004 dataset. Experiments showed that the proposed approach outperforms the related methods.

Keywords: Multi-Document Summarization, Content Coverage, Less Redundancy, Integer Linear Programming

1. Introduction

With the rapid growth of the Internet and the accompanying information explosion, automatic document summarization has drawn increasing attention over the past years. The explosion of electronic documents has made it difficult for users to extract useful information from them, and many relevant and interesting documents go unread because of the sheer amount of information [1].
The information overload problem can be reduced by text summarization. Automatic document summarization aims to condense the original text into its essential content and to assist in filtering and selecting the necessary information. Present search engines usually provide a short summary for each retrieved document so that users can quickly skim the main content of the page, which saves users time and improves the search engine’s service quality [2]. Hence the need for tools that automatically generate summaries. These tools are useful not just for professionals who need to find information in a short time but also for large search engines such as Google, Yahoo!, AltaVista, and others, whose results could benefit considerably from automatically generated summaries. Users would then need to open only the documents that interest them, reducing the flow of information [1,3].

Depending on the number of documents to be summarized, summarization can be single-document or multi-document [4-6]. Single-document summarization can only condense one document into a shorter representation, whereas multi-document summarization can condense a set of documents into a summary. Multi-document summarization can be considered an extension of single-document summarization; it is used to describe precisely the information contained in a cluster of documents and to help users understand the document cluster. Since it combines and integrates the information across documents, it performs knowledge synthesis and knowledge discovery, and can be used for knowledge acquisition [5,7].

This paper focuses on multi-document summarization. It models the text summarization task as an optimization problem. This model directly discovers key sentences in the given collection and covers the main content of the original source(s). The model is implemented on a multi-document summarization task.
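
As a rough illustration of summarization cast as an optimization problem, the following Python sketch selects the subset of sentences that maximizes similarity to the whole collection while penalizing similarity between the selected sentences, subject to a length limit. It is not the model proposed in this paper: the function names `cosine` and `summarize`, the bag-of-words cosine similarity, the `redundancy_weight` parameter, and the exhaustive search used in place of an integer programming solver are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def summarize(sentences, max_words, redundancy_weight=1.0):
    """Return the sentence subset with the best coverage-minus-redundancy score.

    Each subset corresponds to one assignment of binary selection variables.
    Exhaustive search stands in for an integer programming solver here, so
    this is only practical for a handful of sentences (assumption: toy scale).
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    doc = sum(bags, Counter())  # bag of words of the whole collection
    lengths = [len(s.split()) for s in sentences]

    best, best_score = (), float("-inf")
    for k in range(1, len(sentences) + 1):
        for subset in combinations(range(len(sentences)), k):
            # Length constraint: skip subsets that exceed the word budget.
            if sum(lengths[i] for i in subset) > max_words:
                continue
            # Coverage: similarity of each chosen sentence to the collection.
            coverage = sum(cosine(bags[i], doc) for i in subset)
            # Redundancy: pairwise similarity among the chosen sentences.
            redundancy = sum(
                cosine(bags[i], bags[j]) for i, j in combinations(subset, 2)
            )
            score = coverage - redundancy_weight * redundancy
            if score > best_score:
                best, best_score = subset, score
    return [sentences[i] for i in best]
```

Given two identical sentences and one unrelated sentence, a sketch of this kind prefers one copy plus the unrelated sentence over both copies, because the duplicate pair incurs the full redundancy penalty.
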
Experiments on DUC2004 datasets showed that the proposed approach