ArikIturri: an Automatic Question Generator Based on Corpora and NLP Techniques

Itziar Aldabe¹, Maddalen Lopez de Lacalle¹, Montse Maritxalar¹, Edurne Martinez², and Larraitz Uria¹

¹ Department of Computer Languages and Systems, Computer Engineering Faculty, University of the Basque Country, P.O. box 649, E-20080 Donostia, Spain
{jibalari,mlopezdelaca002,montse.maritxalar,larraitz}@ehu.es
² Computer Science Department, Udako Euskal Unibertsitatea (UEU), ELEKA Language Engineering Enterprise
edurne@eleka.net

Abstract. Knowledge construction is expensive for Computer Assisted Assessment. When setting exercise questions, teachers use Test Makers to construct Question Banks. Adding Automatic Generation to assessment applications decreases the time spent on constructing examination papers. In this article, we present ArikIturri, an Automatic Question Generator for Basque language test questions, which is independent of the test assessment application that uses it. The information source for this question generator consists of linguistically analysed real corpora, represented in XML mark-up language. ArikIturri makes use of NLP tools. The article highlights how the robustness of those tools and the corpora used influence the generator. We have demonstrated the viability of ArikIturri for constructing fill-in-the-blank, word formation, multiple choice, and error correction question types. In the evaluation of this automatic generator, we have obtained positive results as regards the generation process and its usefulness.

1 Introduction

It is widely recognized that test construction is time-consuming and expensive for teachers. The use of Computer Assisted Assessment considerably reduces the time teachers spend on constructing examination papers [11]. More specifically, e-assessment helps teachers in the task of setting tests.
For example, in the eLearning Place [3], learning providers create the question bank by means of a TestMaker, a Java Virtual Machine tool. Questions are also constructed manually in SIETTE [5], a web-based tool for adaptive testing. TOKA [8], a web application for Computer Assisted Assessment, provides teachers with a platform for guided assessment in which they also construct exercises. All these tools have been used for the assessment of different subjects. However, the work we present in this article is focused on language learning. In our case, learning providers do not