I.J. Information Technology and Computer Science, 2019, 7, 26-34
Published Online July 2019 in MECS (http://www.mecs-press.org/)
DOI: 10.5815/ijitcs.2019.07.04
Copyright © 2019 MECS

Automating Text Simplification Using Pictographs for People with Language Deficits

Mai Farag Imam
Computer Science Department, Faculty of Computers and Information, Helwan University, Egypt
E-mail: maifarag@yahoo.com

Amal Elsayed Aboutabl and Ensaf H. Mohamed
Computer Science Department, Faculty of Computers and Information, Helwan University, Egypt
E-mail: {amal.aboutabl, ensaf_hussein}@fci.helwan.edu.eg

Received: 21 February 2019; Accepted: 07 June 2019; Published: 08 July 2019

Abstract—Automating text simplification is a challenging research area due to the compound structures present in natural languages. The social involvement of people with language deficits can be enhanced by providing them with means to communicate with the outside world, for instance by using the internet independently. Using pictographs instead of text is one such means. This paper presents a system that performs text simplification by translating text into pictographs. The proposed system consists of a sequence of phases. First, a simple summarization technique is used to reduce the number of sentences before they are converted to pictures. Then, text preprocessing is performed, including tokenization and lemmatization. The resulting text goes through a spelling checker, followed by a word sense disambiguation (WSD) algorithm that selects the word senses most suitable to the context in order to increase the accuracy of the result. Using WSD improves the results, and the system yields its best results when a support vector machine (SVM) is used for WSD. Finally, the text is translated into a list of images. For testing and evaluation purposes, a test corpus of 37 Basic English sentences has been manually constructed.
Experiments are conducted by presenting the list of generated images to ten normal children, who are asked to reproduce the input sentences based on the pictographs. The reproduced sentences are evaluated using precision, recall, and F-score. Results show that the proposed system enhances pictograph understanding and succeeds in converting text to pictographs with precision, recall, and F-score of over 90% when SVM is used for word sense disambiguation. These techniques have not been combined before, and combining them increases the accuracy of the system relative to earlier studies.

Index Terms—Natural language processing, pictographic communication, social inclusion, text simplification, text summarization, word sense disambiguation.

I. INTRODUCTION

Allowing children or people with cognitive disabilities to use the internet or any other information resource easily and smoothly helps reduce their social isolation and thus improves their quality of life. Augmentative and Alternative Communication (AAC) [1] helps people with communication disabilities to be more socially active in interpersonal communication, learning, education, community activities, employment, and care management. Some kinds of AAC are part of everyday communication even for normal people. For example, a person can wave goodbye or give a 'thumbs up' instead of speaking. However, some people have to rely on AAC most of the time. Pictographic communication systems are a form of AAC technology that relies on graphics such as drawings, pictographs, and symbols. Systems based on AAC include Blissymbolics¹, PCS², Beta³, and Sclera⁴ [2].

Text simplification for specific readers (e.g. children) can be defined more broadly to include conceptual simplification, where the content is simplified as well as the form; elaborative modification, where redundancy and explicitness are used to emphasize key points; and text summarization, which reduces text length by omitting peripheral or inappropriate information. The main objective of these operations is to make information more accessible to people with reduced literacy.

Using imagery can make learning easier, more enjoyable, and more interesting. Representing information in visual form makes it easier to recall later, owing to the brain's inherent preference for remembering images over text.

This paper is organized as follows. Section 2 presents a brief background on approaches to text simplification and the related work. Section 3 gives a detailed description of the proposed system, followed by a motivating example and experimental evaluation in Section 4. Finally, Section 5 contains the conclusion and future work.

¹ http://www.blissymbolics.org/
² http://www.mayer-johnson.com/category/symbols-and-photos
³ http://www.betavzw.be
⁴ http://www.sclera.be
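
The evaluation outlined in the abstract scores each sentence reproduced from the pictographs against the original input sentence using precision, recall, and F-score. The following is a minimal sketch of how these three measures could be computed. It assumes word-level overlap counting with lowercasing and punctuation stripping; this matching rule and the function names are assumptions made here for illustration and are not taken from the paper.

```python
from collections import Counter

PUNCTUATION = ".,!?;:\"'"


def tokenize(sentence):
    """Lowercase, split on whitespace, strip surrounding punctuation (assumed normalisation)."""
    words = (w.strip(PUNCTUATION).lower() for w in sentence.split())
    return [w for w in words if w]


def precision_recall_f1(reference, reproduced):
    """Score a reproduced sentence against the reference input sentence.

    Overlap is counted per word with multiplicity (multiset intersection),
    so a word repeated in one sentence is matched at most as many times
    as it occurs in the other.
    """
    ref_counts = Counter(tokenize(reference))
    rep_counts = Counter(tokenize(reproduced))
    overlap = sum((ref_counts & rep_counts).values())

    rep_total = sum(rep_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / rep_total if rep_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, scoring the reproduced sentence "The boy eats apple." against the input "The boy eats an apple." gives a precision of 1.0 (every reproduced word is correct) and a recall of 0.8 (four of the five input words were recovered). Averaging such per-sentence scores over a test corpus is one plausible way to obtain system-level figures.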