Text-to-text generation for question answering

Wauter Bosma, Erwin Marsi, Emiel Krahmer and Mariët Theune

Wauter Bosma, VU University Amsterdam, e-mail: w.bosma@let.vu.nl
Erwin Marsi, Norwegian University of Science and Technology, e-mail: emarsi@idi.ntnu.no
Emiel Krahmer, Tilburg University, e-mail: e.j.krahmer@uvt.nl
Mariët Theune, University of Twente, e-mail: m.theune@ewi.utwente.nl

Abstract

When answering questions, major challenges are (a) to carefully determine the content of the answer and (b) to phrase it in a proper way. In IMIX, we focus on two text-to-text generation techniques to accomplish this: content selection and sentence fusion. Using content selection, we can extend answers to an arbitrary length, providing not just a direct answer but also related information, so as to better address the user’s information need. In this process, we use a graph-based model to generate coherent answers. We then apply sentence fusion to combine partial answers from different sources into a single, more complete answer, while avoiding redundancy. The fusion process involves syntactic parsing, tree alignment and surface string generation.

1 Introduction

Answering specific types of trivia-style (so-called ‘factoid’) questions is often taken as the core domain of question answering (QA) research. An example of such a question would be: what is RSI? But what is the correct answer to such a question? Ostensibly, this is a definition question, and a plausible answer is something like RSI means Repetitive Strain Injury. But will this answer satisfy the information need of the person asking the question? In general, it seems that even if an unambiguous