Using psychometric technology in educational assessment: The case
of a schema-based isomorphic approach to the automatic
generation of quantitative reasoning items
Martin Arendasy ⁎, Markus Sommer
Faculty of Psychology, Differential & Personality Research Group, University of Vienna, Liebiggasse 5, A-1010 Vienna, Austria
Received 19 May 2006; received in revised form 12 March 2007; accepted 17 March 2007
Abstract
This article deals with the investigation of the psychometric quality and construct validity of algebra word problems generated
by means of a schema-based version of the automatic min–max approach. Based on a review of the research literature in algebra
word problem solving and automatic item generation, this new approach is introduced as a theory-based top–down method of
automatic item generation featuring a quality control framework aimed at minimizing the construct-unrelated variance in the item
parameters. The first study deals with the evaluation of an initial set of items. The results are replicated in the second study using a
larger item set, which also allows the investigation of the construct representation of the generated items. Since construct-unrelated
variance components (e.g. reading comprehension) were controlled for in the item generation phase, the results revealed some
interesting insights into the cognitive processes of the actual mathematization phase of algebra word problem solving. The third
study investigated the nomothetic span using hierarchical confirmatory factor analysis. The results argue for the convergent and
discriminant validity of the automatically generated items. Taken together, the results indicate that the automatic generation of
construct valid algebra word problems at a high psychometric level is viable. The discussion is thus concerned with the
implications of this new approach to item generation for theory development and evaluation as well as practical benefits for
educational assessment and the development of intelligent tutoring systems.
© 2007 Elsevier Inc. All rights reserved.
Keywords: Algebra word problems; Automatic item generation; Quantitative reasoning; Rasch Model; Educational assessment
1. Theoretical introduction
In educational assessment we observe how respondents solve the test items presented to them to infer what they are
capable of. However, these inferences are inherently tied to the construct validity of the individual tests. To obtain
evidence on the construct validity of various commonly used intelligence tests, researchers have investigated the
cognitive processes respondents use to solve these tests in order to gain insight into the sources of respondents'
difficulties during the solution process. The research methods used to accomplish this aim range from analyses of
verbal or written protocols, response latencies and the investigation of the impact of experimental manipulations of
Learning and Individual Differences 17 (2007) 366 – 383
www.elsevier.com/locate/lindif
⁎ Corresponding author. Tel.: +43 1 4277 47848.
E-mail address: martin.arendasy@univie.ac.at (M. Arendasy).
1041-6080/$ - see front matter © 2007 Elsevier Inc. All rights reserved.
doi:10.1016/j.lindif.2007.03.005