I.J. Information Technology and Computer Science, 2019, 7, 26-34
Published Online July 2019 in MECS (http://www.mecs-press.org/)
DOI: 10.5815/ijitcs.2019.07.04
Copyright © 2019 MECS
Automating Text Simplification Using
Pictographs for People with Language Deficits
Mai Farag Imam
Computer Science Department, Faculty of Computers and Information, Helwan University, Egypt
E-mail: maifarag@yahoo.com
Amal Elsayed Aboutabl and Ensaf H. Mohamed
Computer Science Department, Faculty of Computers and Information, Helwan University, Egypt
E-mail: {amal.aboutabl, ensaf_hussein}@fci.helwan.edu.eg
Received: 21 February 2019; Accepted: 07 June 2019; Published: 08 July 2019
Abstract—Automating text simplification is a challenging
research area due to the compound structures present in
natural languages. Social involvement of people with
language deficits can be enhanced by providing them with
means to communicate with the outside world, for instance
using the internet independently. Using pictographs
instead of text is one such means. This paper presents a
system which performs text simplification by translating
text into pictographs. The proposed system consists of a
set of phases. First, a simple summarization technique is
used to decrease the number of sentences before
converting them to pictures. Then, text preprocessing is
performed including processes such as tokenization and
lemmatization. The resulting text goes through a spelling
checker followed by a word sense disambiguation (WSD)
algorithm to find the words most suitable to the
context in order to increase the accuracy of the result.
Using WSD improves the results, and the system yields its
best results when a support vector machine (SVM) is used
for WSD. Finally, the text is translated into a
list of images. For testing and evaluation purposes, a test
corpus of 37 Basic English sentences has been manually
constructed. Experiments are conducted by presenting the
list of generated images to ten normal children who are
asked to reproduce the input sentences based on the
pictographs. The reproduced sentences are evaluated using
precision, recall, and F-score. Results show that the
proposed system enhances pictograph understanding and
succeeds in converting text to pictographs with precision,
recall, and F-score of over 90% when SVM is used for
word sense disambiguation. These techniques have not
previously been combined, and combining them increases the
accuracy of the system relative to earlier studies.
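As an illustration of the evaluation step described above, the following is a minimal sketch of how a reproduced sentence could be scored against its input sentence. It assumes word-level overlap on lowercased tokens; the abstract does not state the matching unit, and the names `tokenize` and `score` are hypothetical and not taken from the paper.

```python
from collections import Counter
from typing import List, Tuple


def tokenize(sentence: str) -> List[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,;:!?\"'()") for w in sentence.lower().split())
    return [w for w in words if w]


def score(reference: str, reproduced: str) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) based on word overlap.

    Assumption: a reproduced word counts as correct if it also occurs
    in the reference sentence, matched at most once per occurrence.
    """
    ref_counts = Counter(tokenize(reference))
    rep_counts = Counter(tokenize(reproduced))
    # Multiset intersection keeps the smaller count of each shared word.
    overlap = sum((ref_counts & rep_counts).values())
    precision = overlap / sum(rep_counts.values()) if rep_counts else 0.0
    recall = overlap / sum(ref_counts.values()) if ref_counts else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


if __name__ == "__main__":
    # Hypothetical example: the child omits one word of the input sentence.
    print(score("The boy eats an apple.", "the boy eats apple"))
```

Under this assumption, a reproduction that omits words lowers recall, while a reproduction that adds words not in the input lowers precision; the F-score is the harmonic mean of the two.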
Index Terms—Natural language processing, pictographic
communication, social inclusion, text simplification, text
summarization, word sense disambiguation.
I. INTRODUCTION
Allowing children or people with cognitive disabilities
to easily and smoothly use the internet or any other
information resource helps reduce their social isolation
and thus increases their quality of life. Augmentative and
Alternative Communication (AAC) [1] helps people with
communication disabilities to be more socially active in
interpersonal communication, learning, education,
community activities, employment, and care management.
Some kinds of AAC are part of everyday communication
even for people without communication disabilities. For
example, a person can wave goodbye or give a 'thumbs up'
instead of speaking.
However, some people have to rely on AAC most of the
time. Pictographic communication systems are considered
to be a form of AAC technology that relies on the use of
graphics, such as drawings, pictographs, and symbols.
Systems based on AAC include Blissymbolics¹, PCS², Beta³,
and Sclera⁴ [2].
Text simplification for specific readers (e.g. children)
can be defined more broadly to include conceptual
simplification, where the content is simplified as well as
the form; elaborative modification, where redundancy and
explicitness are used to emphasize key points; and text
summarization, which reduces text length by omitting
peripheral or inappropriate information.
The main objective of these operations is to make
information more accessible to people with reduced
literacy. Using imagery can make learning easier, more
enjoyable, and more interesting. Representing information in
visual form helps people remember it later, owing to the
brain's inherent tendency to remember images more
easily than text.
This paper is organized as follows: Section 2 presents a
brief background on text simplification approaches and
related work. Section 3 gives a
detailed description of the proposed system, followed by a
motivational example and experimental evaluation in
Section 4. Finally, Section 5 contains the conclusion and
future work.
1 http://www.blissymbolics.org/
2 http://www.mayer-johnson.com/category/symbols-and-photos
3 http://www.betavzw.be
4 http://www.sclera.be