Generating Tailored Worked-out Problem Solutions to Help Students Learn from Examples

Cristina Conati and Giuseppe Carenini
Department of Computer Science, University of British Columbia, Vancouver, BC, Canada, V6T 1Z4
{conati, carenini}@cs.ubc.ca

1 Abstract

In this paper we describe a framework that helps students learn from examples. When presenting a new example, the framework uses natural language generation techniques and a probabilistic student model to tailor the example to the student's domain knowledge. Tailoring consists of selectively introducing gaps in the example solution, so that the student can practice applying rules learned from previous examples in problem solving episodes of a difficulty adequate to her knowledge. Filling in solution gaps is part of the meta-cognitive skill known as self-explanation (generating explanations to oneself to clarify an example solution), which is crucial to learning effectively from examples. In this paper, we describe how examples with tailored solution gaps are generated and how they are used to support students in learning through gap-filling self-explanation.

1.1 Keywords

Interactive Information Presentation, Computer-aided Education, User Modeling, Natural Language Generation

2 Introduction

We describe a framework that helps students learn from examples by generating example problem solutions whose level of detail is tailored to the student's domain knowledge. The framework uses natural language generation techniques and a probabilistic student model to selectively introduce gaps in the example solution, so that the student can practice applying rules learned from previous examples in problem solving episodes of a difficulty adequate to her knowledge.
The rationale behind varying the level of detail of an example solution lies in cognitive science studies showing that students who self-explain examples (i.e., generate explanations to themselves to clarify an example solution) learn better than students who read the examples without elaborating on them [1]. One kind of self-explanation that these studies showed to be correlated with learning involves filling in the gaps commonly found in textbook example solutions (gap-filling self-explanation). However, the same studies also showed that most students tend not to self-explain spontaneously. In the case of gap filling, this phenomenon could be due to the fact that filling a gap effectively requires performing problem solving steps while studying an example. Because problem solving can be highly cognitively and motivationally demanding [10], if the gaps in an example solution are too numerous or too difficult for a given student, they may hinder the self-explanations aimed at filling them. We argue that, by monitoring how a student's knowledge changes while she studies a sequence of examples, it is possible to introduce into example solutions gaps that are not too cognitively demanding, thus facilitating gap-filling self-explanation and providing a smooth transition from example study to problem solving. We are testing our hypothesis by extending the SE-Coach [4], an Intelligent Tutoring System (ITS) designed to support self-explanation of physics examples like the one shown in Figure 1. This task is novel in ITS research, as it requires sophisticated natural language generation (NLG) techniques.
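The gap-selection idea above can be sketched in a few lines of code. The sketch below is purely illustrative and assumes a student model that maps each domain rule to an estimated probability of mastery; the names (`SolutionStep`, `tailor_solution`, the 0.8 threshold) are our own assumptions, not the SE-Coach's actual interface. Steps whose underlying rule the student has probably mastered are elided into gaps she can practice filling, while unmastered steps remain worked out in full.

```python
# Hypothetical sketch of gap selection from a probabilistic student model.
# All names and the threshold value are illustrative assumptions, not the
# SE-Coach's actual API.

from dataclasses import dataclass


@dataclass
class SolutionStep:
    text: str   # natural-language rendering of the solution step
    rule: str   # identifier of the domain rule the step applies


def tailor_solution(steps, student_model, mastery_threshold=0.8):
    """Split a worked-out solution into steps to show and steps to elide.

    A step becomes a gap only when the student's estimated probability of
    knowing its rule is high, so filling the gap is feasible practice
    rather than an overwhelming problem-solving demand.
    """
    shown, gaps = [], []
    for step in steps:
        p_mastered = student_model.get(step.rule, 0.0)
        if p_mastered >= mastery_threshold:
            gaps.append(step)    # student can likely fill this in herself
        else:
            shown.append(step)   # still needs the fully worked-out step
    return shown, gaps


# Toy example: a two-step physics solution and a toy student model.
steps = [
    SolutionStep("Apply Newton's second law: F = m*a", "newton-2nd"),
    SolutionStep("Solve for a: a = F/m", "algebra-isolate"),
]
model = {"newton-2nd": 0.9, "algebra-isolate": 0.3}
shown, gaps = tailor_solution(steps, model)
```

In this toy run, the Newton's-law step becomes a gap (its rule is estimated as mastered) while the algebraic step stays visible. The framework described in the paper uses a probabilistic student model updated across a sequence of examples rather than a fixed dictionary, but the selection principle is the same.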
While the NLG field has extensively studied the process of producing text tailored to a model of the user's inferential capabilities [e.g., 6, 7, 11], applications of NLG techniques in ITSs are few and mainly focus on managing and structuring the tutorial dialogue [e.g., 8, 5], rather than on tailoring the presentation of instructional material to a detailed student model. Several NLG computational models proposed in the literature generate concise text by taking into account the inferential capabilities of the user. The system in [11] generates effective plan descriptions tailored to the hearer's plan reasoning capabilities. The model in [6] takes into account the hearer's logical inference capabilities. And [7] proposes a system that relies on a model of the user's probabilistic inferences to generate sufficiently persuasive arguments. In contrast, our generation system tailors the content and organization of an example to a probabilistic model of the user's logical inferences, which allows us to explicitly represent the inherent uncertainty involved in assessing a learner's knowledge and reasoning processes. Furthermore, our system maintains information on what example parts