Current Developments in Technology-Assisted Education (2006)
© FORMATEX 2006

Automatic Identification of Terms for the Generation of Students’ Concept Maps

D. Perez-Marin *,1, I. Pascual-Nieto 1, E. Alfonseca 1,2 and P. Rodriguez 1

1 Universidad Autonoma de Madrid, C/ Francisco Tomás y Valiente 11, 20849, Madrid, Spain.
2 Tokyo Institute of Technology, 4259 Nagatsuta Midori-ku Yokohama 226-8503 Japan.

Willow [1], an adaptive multilingual free-text Computer-Assisted Assessment system, automatically evaluates students’ free-text answers given a set of correct ones. This paper presents an extension of the system that generates the students’ concept maps while they are being assessed. To that end, a new module for the automatic identification of the terms of a particular knowledge field has been created. It identifies and keeps track of the terms that are used in the students’ answers, and calculates a confidence score of the student's knowledge about each term. An empirical evaluation using the students' real answers shows that it is robust enough to generate a good set of terms from a very small set of answers.

Keywords automatic term identification; concept maps; e-assessment; e-learning

1. Introduction

Concept maps can be defined as visual illustrations that display the organization of concepts and outline the relationships among them. Traditionally, teachers ask their students to draw concept maps of a certain knowledge field. In this way, they can review how well the students understand these concepts. Moreover, they can find possible misconceptions by looking at how students have related the concepts [2]. Despite their apparent usefulness, concept maps are not yet a common representational medium, and they are not used extensively in classrooms.
This could be because learning to create them is time-consuming and they are difficult to manage on paper [3-4]. Therefore, it would be interesting to automate the generation of the students’ concept maps. As we show later, this can be done from the students’ answers to a free-text Computer-Assisted Assessment (CAA) system [5] such as Willow [6]. In order to build this concept map, the identification of the most important concepts in the students’ answers is a necessary first step.

A term is usually defined as a word or a multi-word expression that is used in specific domains with a specific meaning. Term extraction is an important problem in the Natural Language Processing (NLP) area [7]. Proposed solutions to term extraction usually analyse large collections of domain-specific texts and compare them to general-purpose text, in order to find domain-specific regularities that indicate that a particular word or multi-word expression is a relevant term in that domain. Term candidates are usually returned ranked according to some specific metric or weight that indicates their relevance. In this work we focus on nominal terms (nouns or multi-word noun phrases), and do not consider domain-specific verbs. Therefore, throughout this paper, the word “term” is used to refer to nominal terms only.

Several techniques have been devised to identify and extract the terms of a text:

1. Statistical corpus-based approaches such as in [8,9].
2. Linguistic processing techniques such as part-of-speech patterns, or the use of parsers [10,11].
3. Hybrid approaches which combine statistical techniques and linguistic knowledge [12,13].

Concepts are usually labeled by terms [14], and a traditional procedure to choose them has been to consult a group of experts or assessors [15]. However, this approach has been criticized, since leaving the decision to humans makes it subjective [16] and two humans tend not to agree completely.
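As an illustration of the statistical, corpus-comparison idea described above, the following minimal sketch ranks single-word candidates by how much more frequent they are in a domain text than in a general-purpose text. It is not the method used in Willow: the function names (`tokenize`, `rank_terms`), the smoothed log-ratio score, the `min_count` threshold, and the toy texts are all assumptions made for this example, and it handles only single words rather than multi-word noun phrases.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rank_terms(domain_text, general_text, min_count=2):
    """Rank words by the log-ratio of their smoothed relative frequencies.

    A positive score means the word is relatively more frequent in the
    domain text than in the general-purpose text.
    """
    domain = Counter(tokenize(domain_text))
    general = Counter(tokenize(general_text))
    n_dom = sum(domain.values())
    n_gen = sum(general.values())
    vocab = len(set(domain) | set(general))

    scores = {}
    for word, count in domain.items():
        if count < min_count:
            continue  # skip words too rare to be reliable candidates
        # Add-one smoothing so words unseen in the general text do not
        # cause a division by zero.
        p_dom = (count + 1) / (n_dom + vocab)
        p_gen = (general[word] + 1) / (n_gen + vocab)
        scores[word] = math.log(p_dom / p_gen)

    # Highest score first: the most domain-specific candidates.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy texts, standing in for real corpora.
domain_text = (
    "the process scheduler assigns each process to the processor "
    "the scheduler runs the process"
)
general_text = (
    "the cat sat on the mat and the dog ran to the park "
    "while the sun was shining"
)

for word, score in rank_terms(domain_text, general_text):
    print(f"{word}\t{score:.3f}")
```

In this toy run the domain words "process" and "scheduler" receive positive scores, while the function word "the" scores slightly below zero because it is about equally frequent in both texts. A realistic system would need far larger corpora and, for nominal terms, a linguistic filter such as part-of-speech patterns.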
* Corresponding author: e-mail: diana.perez@uam.es, Phone: +34 91 497 22 67, Fax: +34 91 497 22 35

To the best of our knowledge, no previous attempt has been made to use NLP techniques to automatically extract the terms for generating concept maps for educational purposes. This would be