Students, Teachers, Exams and MOOCs: Predicting and Optimizing Attainment in Web-Based Education Using a Probabilistic Graphical Model

Bar Shalem¹, Yoram Bachrach², John Guiver², and Christopher M. Bishop²

¹ Bar-Ilan University, Ramat Gan, Israel
² Microsoft Research, Cambridge, UK

Abstract. We propose a probabilistic graphical model for predicting student attainment in web-based education. We empirically evaluate our model on a crowdsourced dataset of students and teachers: teachers prepared lessons on various topics, and students read lessons by various teachers and then solved a multiple-choice exam. Our model takes as input data on past interactions between students and teachers and on past student attainment. It then estimates the abilities of students, the competence of teachers and the difficulty of questions, and predicts future student outcomes. We show that our model's predictions are more accurate than heuristic approaches. We also show how demographic profiles and personality traits correlate with student performance in this task. Finally, given a limited pool of teachers, we propose an approach that uses information from our model to maximize the number of students passing an exam of a given difficulty by optimally assigning teachers to students. We evaluate the potential impact of our optimization approach using a simulation based on our dataset, showing an improvement in overall performance.

1 Introduction

Recent years have seen an enormous leap in the use of the Internet and web-based technology. This technology has had a huge impact on education, where web-based and online training are emerging as a new paradigm in learning [26]. Distance learning technology makes educational resources easier to access, reduces costs and widens participation in education [28,2,40].
Intelligent online educational technologies enable deep analysis of student solutions and allow the content or the difficulty of exercises to be tailored automatically to the specific student [11]. One innovation that could affect higher education is massive open online courses (MOOCs): online training designed for large-scale participation through open access to resources [36,16]. MOOC providers offer a wide selection of courses, some of which already attract many students.¹

¹ See, for example, the report on Peter Norvig and Sebastian Thrun's online artificial intelligence course, with its "100,000 student classroom", at http://www.ted.com/talks/peter_norvig_the_100_000_student_classroom.html.

T. Calders et al. (Eds.): ECML PKDD 2014, Part III, LNCS 8726, pp. 82–97, 2014. © Springer-Verlag Berlin Heidelberg 2014