Research on Language and Computation manuscript No.
(will be inserted by the editor)

Performing aggregation and ellipsis using discourse structures

Mariët Theune¹, Feikje Hielkema², Petra Hendriks³

¹ Human Media Interaction, Department of Computer Science, University of Twente, The Netherlands, e-mail: M.Theune@ewi.utwente.nl
² Department of Computing Science, University of Aberdeen, Scotland, UK, e-mail: fhielkem@csd.abdn.ac.uk
³ Center for Language and Cognition, University of Groningen, The Netherlands, e-mail: P.Hendriks@rug.nl

Received: date / Revised version: date

Abstract This article describes the generation of aggregated and elliptic sentences, using Dependency Trees connected by rhetorical relations as input. The system we have developed can generate both hypotactic and paratactic constructions with appropriate cue words, and various forms of ellipsis such as Gapping and Conjunction Reduction. We contend that Dependency Trees connected by rhetorical relations are excellent input for a generation system that has to generate ellipsis, and we propose a taxonomy of the most common Dutch cue words, grouped according to the kind of discourse relations they signal. Finally, we argue that syntactic aggregation should be performed in the Surface Realizer of a language generation system, because it requires access to language-specific syntactic information.

Key words aggregation, dependency trees, discourse structure, ellipsis, language generation

1 Introduction

Ellipsis and coordination are key features of natural language. For a Natural Language Generation (NLG) system to produce fluent, coherent texts, it must be able to generate coordinated and elliptic sentences. The generation of such sentences is part of a process called aggregation, which is one of the basic tasks of any NLG system (Reiter and Dale, 2000). However, there is no consensus on the definition of aggregation.
It is an amalgam of processes

Feikje Hielkema carried out this work while she was at the University of Groningen.