Wu, H.-M., Kuo, B.-C., & Yang, J.-M. (2012). Evaluating Knowledge Structure-based Adaptive Testing Algorithms and System Development. Educational Technology & Society, 15(2), 73–88. ISSN 1436-4522 (online) and 1176-3647 (print). © International Forum of Educational Technology & Society (IFETS).

Evaluating Knowledge Structure-based Adaptive Testing Algorithms and System Development

Huey-Min Wu, Bor-Chen Kuo and Jinn-Min Yang
Research Center for Testing and Assessment, National Academy for Educational Research, New Taipei City, Taiwan // Graduate Institute of Educational Measurement and Statistics, National Taichung University of Education, Taichung, Taiwan // Department of Mathematics Education, National Taichung University of Education, Taichung, Taiwan // lhswu@seed.net.tw // kbc@mail.ntcu.edu.tw // ygm@ms3.ntcu.edu.tw

(Submitted September 11, 2009; Revised January 26, 2011; Accepted March 31, 2011)

ABSTRACT
In recent years, many computerized test systems have been developed for diagnosing students' learning profiles. Nevertheless, it remains a challenging issue to find an adaptive testing algorithm that both shortens testing time and precisely diagnoses the knowledge status of students.
To find a suitable algorithm, four adaptive testing algorithms, based respectively on ordering theory, item relational structure theory, Diagnosys, and domain experts, were evaluated in terms of training sample size, prediction accuracy, and the number of test items used, in a simulation study with paper-based test data. Ordering theory showed the best performance in the simulation, so an ordering-theory-based knowledge-structure adaptive testing system was developed and evaluated. The results showed that the two interfaces, paper-based and computer-based, did not affect the examinees' performance. In addition, the effect of correct guessing was examined, and two adaptive testing methods were proposed to mitigate it; the experimental results showed that the proposed methods reduce the effect of correct guessing.

Keywords
Adaptive test algorithm, Computerized adaptive test, Diagnostic test, Knowledge structure, Ordering theory

Introduction

During the last two decades, from the functional aspect, many computerized test systems have been developed for estimating the abilities of examinees (Chang, Lin, & Lin, 2007; Guzman & Conejo, 2005; Lewis & Sheehan, 1990; Sands, Waters, & McBride, 1997; Sheehan & Lewis, 1992; Wainer, 2000; van der Linden, 2000; Tao, Wu, & Chang, 2008; Yen, Ho, Chen, Chou, & Chen, 2010) or for diagnosing students' learning profiles (Appleby, Samuels, & Treasure-Jones, 1997; Chang, Liu, & Chen, 1998; Hwang, Hsiao, & Tseng, 2003; Liu, 2005; Tsai & Chou, 2002; Tselios, Stoica, Maragoudakis, Avouris, & Komis, 2006; Vomlel, 2004; Yu & Yu, 2006).
From the theoretical aspect, some of these systems are based on item response theory (IRT) (Chang et al., 2007; Guzman & Conejo, 2005; Lewis & Sheehan, 1990; Sands et al., 1997; Sheehan & Lewis, 1992; Wainer, 2000; van der Linden, 2000; Yen et al., 2010), some on artificial intelligence techniques such as Bayesian networks (Liu, 2005; Tselios et al., 2006; Vomlel, 2004), and others on knowledge structures. From the operational aspect, some computerized tests are adaptive and others are non-adaptive. The focus of this study is to construct computerized adaptive tests based on knowledge structures for diagnosing students' learning profiles.

A computerized adaptive test (CAT) can not only offer examinees items customized to their aptitudes or cognitive status, but can also shorten the test. A CAT based on IRT models can obtain efficient estimates of subjects' abilities, but it cannot diagnose subjects' cognitive concepts at a detailed level (Tatsuoka, Corter, & Tatsuoka, 2004; Yan, Almond, & Mislevy, 2004). In contrast, adaptive tests based on knowledge structures or artificial intelligence can provide information about how well subjects performed on specific concepts, and so can serve a diagnostic function (Appleby et al., 1997; Tatsuoka et al., 2004; Vomlel, 2004).

Diagnosys, developed by Appleby et al. (1997), is a knowledge-based computer diagnostic test of basic mathematical concepts. Diagnosys estimates the knowledge structure of examinees and then applies this structure to drive the adaptive testing process. Chang et al. (1998) proposed adaptive test algorithms to construct a computerized adaptive diagnostic test based on knowledge structures constructed by domain experts.
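The knowledge-structure-driven selection described above can be sketched in code. The following is a hypothetical illustration, not the exact algorithm of Diagnosys or Chang et al. (1998): given a prerequisite graph over concepts (a toy four-concept structure here), a correct answer is taken to imply mastery of all prerequisite concepts, and an incorrect answer to imply non-mastery of all dependent concepts, so the corresponding items need not be administered.

```python
# Hypothetical sketch of knowledge-structure-based adaptive item
# selection (illustration only, not the authors' algorithm).

prereq = {            # concept -> its direct prerequisites (toy structure)
    "add": set(),
    "sub": {"add"},
    "mul": {"add"},
    "div": {"sub", "mul"},
}

def ancestors(c):
    """All prerequisites of concept c, transitively."""
    seen, stack = set(), list(prereq[c])
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(prereq[p])
    return seen

def descendants(c):
    """All concepts that (transitively) require concept c."""
    return {d for d in prereq if c in ancestors(d)}

def adaptive_test(answer):
    """Classify every concept while administering as few items as possible.

    `answer(c)` returns True if the examinee answers the item
    testing concept c correctly."""
    status, items_used = {}, 0
    # Ask highly connected concepts first to maximize inference per item.
    for c in sorted(prereq, key=lambda c: -len(ancestors(c) | descendants(c))):
        if c in status:
            continue                      # already inferred; no item needed
        items_used += 1
        if answer(c):
            status[c] = True
            for p in ancestors(c):        # prerequisites inferred as mastered
                status.setdefault(p, True)
        else:
            status[c] = False
            for d in descendants(c):      # dependents inferred as not mastered
                status.setdefault(d, False)
    return status, items_used
```

On this toy structure, an examinee who has mastered every concept is classified after only two administered items, and one who has mastered none after a single item, which illustrates how structure-based inference shortens the test.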
The results of these two papers show that the proposed algorithms reduce the number of test items used while precisely diagnosing the cognitive status of examinees. However, these studies did not consider the impact of correct guessing on the diagnosis of concepts. Correct guessing means