Comparing attrition prediction in FutureLearn and edX MOOCs

Ruth Cobos
Computer Science and Engineering
Universidad Autónoma de Madrid, Spain
ruth.cobos@uam.es

Adriana Wilde
Electronics and Computer Science
University of Southampton, United Kingdom
agw106@ecs.soton.ac.uk

Ed Zaluska
Electronics and Computer Science
University of Southampton, United Kingdom
ejz@ecs.soton.ac.uk

ABSTRACT
There are a number of similarities and differences between FutureLearn MOOCs and those offered by other platforms, such as edX. In this research we compare the results of applying machine learning algorithms to predict course attrition for two case studies, using datasets from a selected FutureLearn MOOC and an edX MOOC of comparable structure and themes. For each course, we computed a number of attributes in a pre-processing stage from the raw data available. Following this, we applied several machine learning algorithms to the pre-processed data to predict attrition levels for each course. The analysis suggests that the attribute selection varies in each scenario, which in turn affects the behaviour of the prediction algorithms.

Author Keywords
MOOCs, predictive model, learning analytics, attribute selection, FutureLearn, edX.

ACM Classification Keywords
• Applied computing~Education~Interactive learning environments
• Social and professional topics~Informal education
• Human-centered computing~Collaborative and social computing systems and tools
• Computing methodologies~Feature selection

INTRODUCTION
The advances in telecommunications in the last decade, together with the increased accessibility of personal computers and internet-enabled devices, have revolutionised teaching and learning. This increased accessibility has meant that, for more than 35 million students, geographical and economic barriers to learning have been overcome by accessing Massive Open Online Courses (MOOCs) offered by more than 500 universities.
This figure doubled from 2014 to 2015 and is expected to continue to increase, given that (according to Class Central [1]) “1800+ free online courses are starting in October 2016; 206 of them are new”.

The diversity of learning with MOOCs provides unprecedented opportunities for study. In tackling this diversity, it helps to understand the principles and affordances of the FutureLearn platform with respect to those of another well-recognised MOOC provider, such as edX, that offers exemplar courses which could be used for a comparative study.

Against this background, we investigated whether the inherent similarities and differences between the affordances provided by different MOOC platforms (FutureLearn and edX respectively) may influence learner behaviours (assuming all other things are equal), and whether there is an observable factor that can be used as an early predictor of attrition in either case. This is especially valuable as it could be used to inform interventions intended to improve learners’ performance in future courses.

The structure of this paper is as follows. In the section Positioning FutureLearn Courses we describe the MOOC offering against some theoretical underpinnings, and also describe the practical organisation of one exemplar course, contrasting it with that of a comparable edX course. In the section Learning Analytics we review related work on learning analytics, which has predominantly been concerned with studying dropout and with demonstrating the feasibility of machine learning algorithms for classification and prediction. The research questions are specified in the section titled Context of the present approach, and the processes conducted in addressing them are described in the Methodology section, alongside a detailed description of the selected courses (the context of our study) and other technical details.
The results are shown in the Analysis of Results and Discussion section, and the insights obtained are summarised in the section titled Conclusions and Future Work, where we also identify avenues for further research.