Challenging Choices for Text Simplification

Caroline Gasperin, Erick Maziero, and Sandra M. Aluísio

NILC - Núcleo Interinstitucional de Linguística Computacional
ICMC, Universidade de São Paulo, São Carlos, SP, P.O. Box 668, 13560-970, Brazil
{cgasperin,sandra}@icmc.usp.br, erickgm@grad.icmc.usp.br

Abstract. In this paper we discuss particular choices we made during the development of a rule-based syntactic text simplification system. Such choices concern 1) how to deal with adverbial phrases in order to simplify sentences, and 2) the order in which to apply our set of simplification rules. Adverbial phrases have not been considered by previous work on text simplification, but have a considerable impact on the complexity of a sentence. Considering our whole set of simplification rules, we discuss and compare two different orders in which to apply them: empirical and hierarchical.

1 Introduction

In Brazil, according to the index used to measure the literacy level of the population (INAF - National Indicator of Functional Literacy), a vast number of people belong to the so-called rudimentary and basic literacy levels. These people are only able to find explicit information in short texts (rudimentary level) or process slightly longer texts and make simple inferences (basic level). INAF [1] reports that 7% of the individuals were classified as illiterate; 25% as literate at the rudimentary level; 40% as literate at the basic level; and only 28% as literate at the advanced level.

The PorSimples project (Simplificação Textual do Português para Inclusão e Acessibilidade Digital)¹ aims at producing Text Simplification (TS) tools for promoting digital inclusion and accessibility for people with such levels of literacy, and possibly other kinds of reading disabilities. More specifically, the goal is to help these readers to process documents available on the web.
The focus is on texts published in government sites or by relevant news agencies, both expected to be of importance to a large audience with various literacy levels. The language of the texts is Brazilian Portuguese, for which there are no text simplification systems, to the best of our knowledge.

TS aims to maximize the comprehension of written texts through the simplification of their linguistic structure. This may involve simplifying lexical and syntactic phenomena, by substituting words that are only understood by a few people with words that are more usual, and by breaking down and changing the syntactic structure of the sentence.

¹ http://caravelas.icmc.usp.br/wiki/index.php/Principal

T.A.S. Pardo et al. (Eds.): PROPOR 2010, LNAI 6001, pp. 40–50, 2010. © Springer-Verlag Berlin Heidelberg 2010