In Proceedings of the NL 2002 Conference (Berlin, Germany, May 2002).

Optimizing and profiling users online with Bayesian probabilistic modeling

Petri Nokelainen, Henry Tirri, Miikka Miettinen, Tomi Silander
Helsinki Institute for Information Technology, P.O.Box 9800, FIN-02015 Helsinki University of Technology
firstname.lastname@hiit.fi

Jaakko Kurhila
Department of Computer Science, P.O.Box 26, FIN-00014 University of Helsinki
jaakko.kurhila@cs.helsinki.fi

Abstract

One way to build adaptive educational material is to model the user with a questionnaire before he or she enters the system, and then use this information to adapt the platform. For example, profiled users could be offered personalised links to resources based on their metacognitive strategies or intrinsic goal orientations. These machine-understandable beliefs about the profiles of different users could then be updated by collecting additional information with an on-line questionnaire at regular intervals. EDUFORM, an adaptive on-line questionnaire system, is based on intelligent techniques that optimize the number of propositions presented to each respondent. In addition, EDUFORM creates an individual profile for each respondent. The adaptive graphical user interface (e.g., propositions in the questionnaire, collaborative actions and links to resources) is generated automatically, and the profile analysis and the related selection of the order of the propositions are performed with Bayesian probabilistic modeling. Preliminary testing suggests that the obvious advantage of EDUFORM is that its questionnaires are usually significantly shorter than traditional non-adaptive questionnaires. The empirical results show that after dramatically reducing the number of propositions (by 50-60%), one is still able to control the error ratio (12-22%).
In the context of course feedback from a web-based course, the model construction in the Profile creation phase can help teachers find differences among the various learner groups, so that different versions of the web course can be prepared to suit the individual needs of each group. In most cases, the correct profile information of the respondent is obtained with less than 33% of the original proposition set.

Main goal

The main goal of this paper is to describe the design and implementation of a software module called EDUFORM¹, a Web-based data gathering tool that adaptively and dynamically optimizes the number of questionnaire propositions during the actual data gathering process. This is achieved with probabilistic modeling techniques that allow the respondents to be profiled based on the data gathered. EDUFORM uses probabilistic Bayesian modeling [1] to create the respondent profiles, and these models can be used to dynamically optimize the set of propositions that are shown to the user in order to extract the maximum amount of information. It should be observed that although we discuss the adaptive techniques in the context of (course) questionnaires, many of the features used in this restricted evaluation task can be directly applied to the wider context of modern computer-based learning environments [2].

The educational problems investigated in this study are two-fold:

1. A great number of questionnaires, both on paper and in electronic form, are designed on the "one size fits all" principle. Equipped with numerous propositions, usually around one hundred, along with some inadequate propositions related to the theory or underlying model, they prolong the answering process, decreasing internal, external and contextual validity.

2. Learning environments do not profile learners effectively. Effective profiling would allow the systems to promote collaborative and cooperative learning, or make it possible to develop adaptive user interfaces, personalized content and references to additional resources.

¹ http://eduform.cs.helsinki.fi/software.html
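
To make the idea of adaptive proposition selection concrete, the following is a minimal sketch and not the EDUFORM implementation. It assumes a naive Bayes model over a small set of respondent profiles, picks the next proposition by minimizing the expected entropy of the profile belief, and stops once one profile is sufficiently probable. The selection criterion, the stopping threshold, all names (`PRIOR`, `LIKELIHOOD`, `run_questionnaire`) and all probabilities are hypothetical choices made for illustration; the text above states only that Bayesian probabilistic modeling is used.

```python
import math

# Hypothetical toy model: 2 profiles, 3 propositions, answers 0 (disagree) / 1 (agree).
PRIOR = [0.5, 0.5]
# LIKELIHOOD[q][k][a] = P(answer a to proposition q | profile k); made-up numbers.
LIKELIHOOD = [
    [[0.9, 0.1], [0.2, 0.8]],  # proposition 0: strongly discriminating
    [[0.5, 0.5], [0.5, 0.5]],  # proposition 1: uninformative
    [[0.6, 0.4], [0.4, 0.6]],  # proposition 2: weakly discriminating
]


def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def update(belief, q, a):
    """Bayes' rule: belief over profiles after answer a to proposition q."""
    joint = [b * LIKELIHOOD[q][k][a] for k, b in enumerate(belief)]
    total = sum(joint)
    return [j / total for j in joint]


def expected_entropy(belief, q):
    """Entropy of the belief after asking q, averaged over the possible answers."""
    result = 0.0
    for a in range(len(LIKELIHOOD[q][0])):
        p_a = sum(b * LIKELIHOOD[q][k][a] for k, b in enumerate(belief))
        if p_a > 0:
            result += p_a * entropy(update(belief, q, a))
    return result


def next_proposition(belief, asked):
    """Pick the unasked proposition expected to leave the least uncertainty."""
    remaining = [q for q in range(len(LIKELIHOOD)) if q not in asked]
    return min(remaining, key=lambda q: expected_entropy(belief, q))


def run_questionnaire(answer_fn, threshold=0.9):
    """Ask propositions until one profile's probability reaches the threshold."""
    belief, asked = list(PRIOR), []
    while len(asked) < len(LIKELIHOOD) and max(belief) < threshold:
        q = next_proposition(belief, asked)
        belief = update(belief, q, answer_fn(q))
        asked.append(q)
    return belief, asked


if __name__ == "__main__":
    # A respondent who agrees with every proposition.
    belief, asked = run_questionnaire(lambda q: 1)
    print("asked:", asked, "belief:", belief)
```

With these toy numbers, a respondent who agrees with everything is asked propositions 0 and 2 only; the uninformative proposition 1 is never shown, which is the kind of shortening the adaptive questionnaire aims for.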