International Journal of Technology Diffusion, 3(2), 12-27, April-June 2012
Copyright © 2012, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
Keywords: Automatic Text Summarization, Feature Extraction, Manual Evaluation, Natural Language
Processing, Semantics
1. INTRODUCTION
As the amount of information on the internet grows
rapidly, it becomes increasingly difficult to select
the relevant information we need to satisfy our
requirements. Furthermore, publication media
vary from specialist journals to newspapers
and many other forms.
Text Summarization (TS) appears to be the best
solution for users to decide whether a document
will help them or not. Summarization is the process of
producing a shorter, informative presentation
of the most important information from a source
or multiple sources of information according to
Automatic Arabic Text Summarization System (AATSS) Based on Semantic Features Extraction
Nabil M. Hewahi, Bahrain University, Bahrain
Kathrein Abu Kwaik, Al Azhar University, Palestine
ABSTRACT
Recently, the need has increased for an effective and powerful tool to automatically summarize text. For
English and other European languages, intensive work has been done with high performance, and researchers
now look toward multi-document and multi-language summarization. However, the Arabic language still suffers
from the little attention and research devoted to it in this field. In this paper, we propose a model to automatically
summarize Arabic text using text extraction. The approach involves several steps: preprocessing the text,
extracting a set of features, classifying sentences based on a scoring method, ranking the sentences, and finally
generating an extracted summary. The main difference between the proposed system and other Arabic summarization
systems is its consideration of semantics, entity objects such as names and places, and similarity factors.
The proposed system has been applied to the news domain using a dataset obtained
from a local newspaper. Manual evaluation techniques are used to evaluate and test the system. The
proposed method achieves 86.5% similarity between the system and human summarization.
A comparative study between our proposed system and the Sakhr Arabic online summarization system has been
conducted. The results show that our proposed system outperforms the Sakhr system.
DOI: 10.4018/jtd.2012040102
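The extraction pipeline outlined in the abstract (preprocessing, feature extraction, sentence scoring, ranking, and summary generation) can be illustrated with a minimal Python sketch. It is not the AATSS implementation: the three features used here (term frequency, sentence position, and overlap with the title) are simple stand-ins for the semantic, entity, and similarity features the abstract mentions, the equal feature weights are an assumption, and the stop-word list and all function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list for illustration only; a real Arabic
# system would use a full Arabic stop-word list and a stemmer.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "are"}


def split_sentences(text):
    """Split after sentence-final punctuation, including the Arabic question mark."""
    parts = re.split(r"(?<=[.!?؟])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Preprocessing step: lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"\w+", sentence.lower()) if w not in STOP_WORDS]


def score_sentences(sentences, title):
    """Score each sentence with three stand-in features, equally weighted (assumed)."""
    tokenized = [tokenize(s) for s in sentences]
    freq = Counter(w for toks in tokenized for w in toks)
    max_freq = max(freq.values(), default=1)
    title_words = set(tokenize(title))
    n = len(sentences)
    scores = []
    for i, toks in enumerate(tokenized):
        if not toks:
            scores.append(0.0)
            continue
        # Feature 1: mean normalised term frequency of the sentence's words.
        tf = sum(freq[w] / max_freq for w in toks) / len(toks)
        # Feature 2: position; earlier sentences score higher.
        position = 1.0 - i / n
        # Feature 3: share of title words that also appear in the sentence.
        overlap = len(title_words & set(toks)) / len(title_words) if title_words else 0.0
        scores.append((tf + position + overlap) / 3.0)
    return scores


def summarize(text, title, k=2):
    """Rank sentences by score and return the top k in their original order."""
    sentences = split_sentences(text)
    scores = score_sentences(sentences, title)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]
```

Calling `summarize(text, title, k=2)` returns the two highest-scoring sentences in document order. Returning them in their original order rather than in score order is a design choice intended to keep the extract readable.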