A minimally supervised approach for question generation: what can we learn from a single seed?

Sérgio Curto, Ana Cristina Mendes and Luísa Coheur

Spoken Language Systems Laboratory - L2F/INESC-ID
Instituto Superior Técnico, Technical University of Lisbon
R. Alves Redol, 9 - 2º, 1000-029 Lisboa, Portugal
{sergio.curto,ana.mendes,luisa.coheur}@l2f.inesc-id.pt

Abstract. In this paper, we investigate how many quality natural language questions can be generated from a single question/answer pair (a seed). In our approach we learn patterns that relate the various levels of linguistic information in the question/answer seed with the same levels of information in text. These patterns contain lexical, syntactic and semantic information, and when they are matched against a target document, new question/answer pairs can be generated. Here, we focus specifically on the task of generating questions. Several works, for instance in Question Answering, explore the re-writing of questions to create (usually lexical) patterns; instead, we use several levels of linguistic information: lexical, syntactic and semantic (the latter through the use of named entities). Moreover, such patterns are commonly hand-crafted, whereas our strategy learns them automatically from a single seed. Preliminary results show that with the single question/answer seed pair “When was Leonardo da Vinci born?”/1452 we manage to generate several questions (from documents related to 25 personalities), of which 80% were evaluated as plausible.

1 Introduction

Question Generation (QG) has become an appealing line of research. Several workshops have been exclusively dedicated to this topic, including a shared evaluation challenge with the goal of generating questions from paragraphs and sentences [17]. The interest in QG has recently increased for several reasons.
On the one hand, generating questions (and answers) can be useful for Question Answering (QA) or Dialogue Systems, as QG can supply questions to train a system to operate in a new domain. On the other hand, QG shows potential in tasks related to knowledge assessment, from two different perspectives: it reduces the time teachers spend creating tests, which can be a time-consuming trial-and-error process when done manually; and it allows learners to self-evaluate the knowledge they have acquired.