Using psychometric technology in educational assessment: The case of a schema-based isomorphic approach to the automatic generation of quantitative reasoning items

Martin Arendasy, Markus Sommer

Faculty of Psychology, Differential & Personality Research Group, University of Vienna, Liebiggasse 5, A-1010 Vienna, Austria

Received 19 May 2006; received in revised form 12 March 2007; accepted 17 March 2007

Abstract

This article investigates the psychometric quality and construct validity of algebra word problems generated by means of a schema-based version of the automatic min-max approach. Based on a review of the research literature on algebra word problem solving and automatic item generation, this new approach is introduced as a theory-based top-down method of automatic item generation featuring a quality control framework aimed at minimizing construct-unrelated variance in the item parameters. The first study evaluates an initial set of items. The results are replicated in the second study using a larger item set, which also allows the investigation of the construct representation of the generated items. Since construct-unrelated variance components (e.g. reading comprehension) were controlled for in the item generation phase, the results revealed some interesting insights into the cognitive processes of the actual mathematization phase of algebra word problem solving. The third study investigated the nomothetic span using hierarchical confirmatory factor analysis. The results argue for the convergent and discriminant validity of the automatically generated items. Taken together, the results indicate that the automatic generation of construct-valid algebra word problems at a high psychometric level is viable.
The discussion addresses the implications of this new approach to item generation for theory development and evaluation, as well as its practical benefits for educational assessment and the development of intelligent tutoring systems.

© 2007 Elsevier Inc. All rights reserved.

Keywords: Algebra word problems; Automatic item generation; Quantitative reasoning; Rasch Model; Educational assessment

Learning and Individual Differences 17 (2007) 366–383. doi:10.1016/j.lindif.2007.03.005

Corresponding author: M. Arendasy. Tel.: +43 1 4277 47848. E-mail address: martin.arendasy@univie.ac.at

1. Theoretical introduction

In educational assessment we observe how respondents solve the test items presented to them in order to infer what they are capable of. However, these inferences are inherently tied to the construct validity of the individual tests. In order to obtain evidence on the construct validity of various commonly used intelligence tests, researchers have investigated the cognitive processes respondents use to solve these tests to gain insight into the sources of their difficulties during the solution process. The research methods used to accomplish this aim range from analyses of verbal or written protocols, response latencies and the investigation of the impact of experimental manipulations of