Journal of Intelligent & Fuzzy Systems xx (20xx) x–xx
DOI:10.3233/JIFS-179902
IOS Press
Determining the importance of sentence
position for automatic text summarization
Griselda Areli Matias Mendoza, Yulia Ledeneva∗ and Rene Arnulfo García-Hernández
Universidad Autónoma del Estado de México, Unidad Académica Profesional Tianguistenco, Instituto Literario,
Toluca, Edo. Mex, México
Abstract. Automatic Extractive Summarization (AES) methods use features of the sentences in the original text
to extract the most important information for inclusion in the summary. It is known that the first sentences of a text
are more relevant than the rest (this heuristic is called the baseline), so the position of a sentence (in reverse order)
is used to determine its relevance, which means that the last sentences have practically no chance of being selected. In
this paper, we present a way to soften the importance of sentences according to their position. Comprehensive tests were
carried out on one of the best AES methods, using the bag-of-words and n-gram models with the DUC02 and DUC01 data
sets, to determine the importance of sentences.
Keywords: Automatic text summarization, n-gram model, bag-of-words model, slope calculation, genetic algorithm
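The position heuristic described in the abstract can be illustrated with a small sketch. It contrasts a strict reverse-order position weight with a softened variant. The function names, the linear form, and the `slope` parameter are assumptions introduced here for illustration only; they are not the formula evaluated in this paper.

```python
def reverse_position_weights(n):
    """Strict reverse-order weights for n sentences.

    The first sentence gets raw score n and the last gets 1;
    the scores are then normalized so that they sum to 1.
    """
    raw = [n - i for i in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


def softened_position_weights(n, slope=0.5):
    """Hypothetical linear softening of the position weight.

    Assumption (not from the paper): the raw score of sentence i
    is n - slope * i, with slope in [0, 1].
    slope = 1 reproduces the strict reverse-order weights;
    slope = 0 gives every position the same weight.
    """
    if not 0.0 <= slope <= 1.0:
        raise ValueError("slope must be in [0, 1]")
    raw = [n - slope * i for i in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


if __name__ == "__main__":
    # Compare how much weight the last sentence keeps for several slopes.
    for s in (1.0, 0.5, 0.0):
        weights = softened_position_weights(5, s)
        print(s, [round(w, 3) for w in weights])
```

Under these assumptions, a smaller slope leaves the last sentences with a larger share of the weight, so they keep some chance of being selected.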
1. Introduction
Currently, information is growing exponentially, and so is the time needed to process it. Therefore, it is essential to have methods for Automatic Extractive Summarization (AES). The purpose of AES methods is to generate summaries that are as similar as possible to those written by humans. Presently, summaries are used in different areas. They are employed to summarize information from, for example, videos [1], newspapers [2–4], scientific papers [5], and social networks such as Twitter [6, 7] or blogs [8], where information changes rapidly and technologies are required to access real-time information represented in reduced form.
According to Ladda Saunmali [9], the purpose of a text summary is to present the most important information in a shorter version of the original text, maintaining its main content and helping the user to quickly understand a large volume of information. According to Alfonseca, Berker, and Da Cunha Fanego, among others [9–17], summaries are classified by their condensation strategy into abstractive and extractive summaries. Abstractive summaries are generated from an understanding of the document and describe its content with words or sentences that sometimes do not appear in the original text. Extractive summaries, in contrast, are generated by selecting key phrases, sentences, or paragraphs considered essential to the original text, so they do not require an understanding of the document.
∗Corresponding author. Yulia Ledeneva, Autonomous University of the State of Mexico, Instituto Literario No. 100, CP 50000, Toluca, State of Mexico, Mexico. E-mail: yledeneva@yahoo.com.
ISSN 1064-1246/20/$35.00 © 2020 – IOS Press and the authors. All rights reserved
Among the methods proposed for AES are those that need a large number of language resources [18–23], so they are highly language-dependent or require sophisticated processes to generate a summary. There are also methods that use only the structure and distribution of the original text, so they are less dependent on language [2, 3, 24–29]. Language-dependent methods may show better results than language-independent ones. However, research in language-independent methods has grown because