Learning Task-related Strategies from User Data through Clustering

Mihaela Cocea*†, George D. Magoulas*

* School of Computing, University of Portsmouth, Portsmouth, UK
Email: mihaela.cocea@port.ac.uk
London Knowledge Lab, Birkbeck College, University of London, London, United Kingdom
Email: gmagoulas@dcs.bbk.ac.uk

Abstract—In exploratory learning environments, learners can use different strategies to solve the same problem. Not all of these strategies, however, are known to the teacher and, even if they were, introducing them into the knowledge base would require considerable time and effort. In this paper we propose a learning mechanism that extracts strategies from user data and presents them to the teacher for further authoring. To this end, a clustering approach is used in which the strategies of learners are grouped into clusters and the teacher is presented with a representative strategy for each cluster. The teacher can then decide whether to store the proposed strategies or to author them further. This approach allows the knowledge base to be populated from user data, thus saving authoring time for the teacher.

Keywords-clustering, learning from user data, exploratory learning environments

I. INTRODUCTION

Exploratory learning environments (ELEs) present learners with constructionist activities [1] in which learners vary the parameters of their constructions/models and observe the implications for the behaviour of their models. ELEs are associated with the so-called ill-defined domains [2], in which problems are less structured and the boundaries between correct and incorrect approaches to solving a task are not clear-cut. Moreover, problems in these domains are characterised by having several equally valid solutions. ELEs that are equipped with guidance and support mechanisms have been shown to have a positive impact on learning when compared with other structured learning environments [3].
Lack of support, however, may hinder learning [4] and outweigh the advantages of ELEs. Therefore, to make ELEs more effective, intelligent support is needed, despite the difficulties arising from their open nature.

To address this, [5] proposed a learner modelling mechanism for monitoring learners’ actions when constructing and/or exploring models by modelling sequences of actions that reflect different strategies for solving a task. An important problem, however, remains: only a limited number of strategies are known in advance and can be introduced by the teacher. In addition, even if all strategies were known, introducing them into the knowledge base would take considerable time and effort.

To reduce this time and effort, we propose a mechanism for learning strategies from user data through clustering, in the context of an ELE for mathematical generalisation called eXpresser [6]. The mechanism produces a number of representative strategies that can be presented to the teacher for further authoring or for storing in the knowledge base.

The next section briefly introduces eXpresser and gives examples of mathematical generalisation tasks. Section 3 presents the mechanism we propose for learning strategies from user data. Experimental results using data from a classroom session are presented in Section 4. Section 5 discusses the results and concludes the paper.

II. THE EXPLORATORY LEARNING ENVIRONMENT

eXpresser [6] is an ELE for the domain of mathematical generalisation; it is designed for 11-14 year olds and for classroom use. The tasks involve building a construction and deriving an algebraic-like rule from it. Two typical tasks, ‘pond tiling’ and ‘footpath’, are described below.

‘Pond tiling’ requires finding a general rule for surrounding any rectangular pond with tiles. The construction, several ways of building it and their corresponding rules are displayed in Figure 1; the variables w and h refer to the width and the height of the pond.
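To make concrete the idea that one construction can be built in several ways, each with its own rule, the pond tiling task can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the surround is taken to be one tile thick and to include the four corners, and the three decompositions below (and all function and variable names) are hypothetical examples, not necessarily the rules shown in Figure 1.

```python
def surround_by_counting(w: int, h: int) -> int:
    """Count surrounding tiles directly: the (w+2) x (h+2) frame minus the pond.

    Assumes a one-tile-thick border that includes the four corner tiles.
    """
    return (w + 2) * (h + 2) - w * h


# Hypothetical decompositions of the same border; each one corresponds to a
# different way of building the construction and so to a different rule.
RULES = {
    "two full rows plus two side columns": lambda w, h: 2 * (w + 2) + 2 * h,
    "four sides plus four corners": lambda w, h: 2 * w + 2 * h + 4,
    "four arms, each taking one corner": lambda w, h: 2 * (w + 1) + 2 * (h + 1),
}


def all_rules_agree(max_size: int = 20) -> bool:
    """Check that every rule matches the direct count for small ponds."""
    return all(
        rule(w, h) == surround_by_counting(w, h)
        for w in range(1, max_size + 1)
        for h in range(1, max_size + 1)
        for rule in RULES.values()
    )


if __name__ == "__main__":
    # The instance used in the text: a pond of width 5 and height 3.
    print(surround_by_counting(5, 3))  # 20
    print(all_rules_agree())           # True
```

All three expressions simplify to 2w + 2h + 4, so they are equally valid rules for the same task even though the constructions behind them differ in structure.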
The ‘footpath’ task requires building a construction such as the one in Figure 2(a) and finding a rule for the green (lighter colour) tiles in relation to the red (darker) tiles, i.e. the footpath; some blocks of the construction are expanded for ease of visualisation, and the variable red refers to the number of red tiles.

In these figures, the internal structure of the constructions has been highlighted for clarity. In eXpresser all constructions would look the same in the normal course of the task.

Each construction is called a strategy and is made of several patterns. For example, the construction in Figure 2(c) is made of 4 patterns: two green (lighter colour) patterns made of 7 tiles with no gaps between them, which are placed at the top and the bottom of the construction; one green pattern made of 4 tiles with gaps of one tile between them; and one red (darker colour) pattern made of 4 tiles with gaps of one tile between them.

The strategies above are illustrated using one particular instance for each task, i.e. a ‘pond’ of width 5 and height 3, and a ‘footpath’ of 3 tiles; however, learners build constructions of various dimensions corresponding to different instances of the task. The goal is to build a construction that is general, i.e. one that is correct for any instance of the task. To verify whether their construction is general, the system allows the learners to animate their construction by varying the values of the