Predicting Student Performance using Advanced Learning Analytics

Ali Daud a,d
a Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
adfmohamad@kau.edu.sa

Naif Radi Aljohani a
a Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
nraljohani@kau.edu.sa

Rabeeh Ayaz Abbasi a,b
b Department of Computer Sciences, Quaid-i-Azam University, Islamabad, Pakistan
rabbasi@kau.edu.sa

Miltiadis D. Lytras c
c Computer Information Systems Department, The American College of Greece, Greece
mlytras@acg.edu

Farhat Abbas d
d Department of Computer Science and Software Engineering, International Islamic University, Islamabad, Pakistan
farhatabbas421@gmail.com

Jalal S. Alowibdi e
e Faculty of Computing and Information Technology, University of Jeddah, Saudi Arabia
jalowibdi@uj.edu.sa

ABSTRACT
Educational Data Mining (EDM) and Learning Analytics (LA) have emerged as active areas of research that extract useful knowledge from educational databases for many purposes, such as predicting students’ success. The ability to predict a student’s performance can help guide actions in modern educational systems. Existing methods have used features that are mostly related to academic performance, family income and family assets, while features related to family expenditures and students’ personal information are usually ignored. This paper investigates the latter feature sets using data collected on scholarship-holding students from different universities in Pakistan. Learning analytics, discriminative and generative classification models are applied to predict whether a student will be able to complete the degree or not. Experimental results show that the proposed method significantly outperforms existing methods by exploiting the family expenditure and students’ personal information feature sets.
Outcomes of this EDM/LA research can serve as a policy improvement method in higher education.

CCS CONCEPTS
Computing methodologies~Supervised learning by classification; Applied computing~Education

KEYWORDS
Learning Analytics (LA); Educational Data Mining (EDM); Student Performance Prediction; Family Expenditures; Students’ Personal Information

© 2017 International World Wide Web Conference Committee (IW3C2), published under Creative Commons CC BY 4.0 License. WWW 2017 Companion, April 3-7, 2017, Perth, Australia. ACM 978-1-4503-4914-7/17/04. DOI: http://dx.doi.org/10.1145/3041021.3054164

1. INTRODUCTION
Students are the main stakeholders of institutions and universities, and their performance plays a significant role in a country’s social and economic growth by producing creative graduates, innovators and entrepreneurs [26]. Educational data mining has emerged as a very important area of research for revealing presentable and applicable knowledge from large educational data repositories. Data mining algorithms are used to obtain hidden information and the desired benefits from these large data repositories [17]. There is a critical demand for academic institutions to maintain and integrate large datasets of learners for multipurpose decision making. The use of web technology has also become an integral part of education in many universities, increasing the amount of data about students, teachers and their interactions with learning and educational systems [7,15].

Higher education plays an important role in the development of a society. It is a field that provides a large amount of data about participants and resources such as students, teachers, facilities and curricula [21]. The performance of students is a main concern of various stakeholders, including educators, administrators and corporations. When recruiting fresh graduates, academic achievement is the main factor considered by recruiting agencies.
Therefore, graduates have to work hard for excellent grades so that they can meet the expectations of recruiting agencies [26].

The sources of educational data may be broadly divided into two categories. The first category comprises centralized educational systems such as learning management systems (LMS); ‘centralized’ here means that the educational data used in the analytics come from one source [4]. The second comprises de-centralized educational data retrieved from the resources of different systems, such as WWW data and massive open online courses, which make use of many systems to deliver learning materials [5,18].

Recently, the analysis of educational data, under names such as learning analytics, academic analytics, educational data mining, predictive analytics and learners’ analytics, has emerged as an innovative area of research [22]. What all of these terms have in common is the use of educational data for multiple purposes. More recently, a new term, ‘educational data science’, has been introduced to clarify how different disciplines and researchers with different research interests and backgrounds can work in this area [15].