Predicting Student Performance
using Advanced Learning Analytics
Ali Daud (a,d)
(a) Faculty of Computing and
Information Technology,
King Abdulaziz University,
Jeddah, Saudi Arabia
adfmohamad@kau.edu.sa
Naif Radi Aljohani (a)
(a) Faculty of Computing and
Information Technology,
King Abdulaziz University,
Jeddah, Saudi Arabia
nraljohani@kau.edu.sa
Rabeeh Ayaz Abbasi (a,b)
(b) Department of Computer Sciences,
Quaid-i-Azam University,
Islamabad, Pakistan
rabbasi@kau.edu.sa
Miltiadis D. Lytras (c)
(c) Computer Information Systems
Department, The American College
of Greece, Greece
mlytras@acg.edu
Farhat Abbas (d)
(d) Department of Computer Science
and Software Engineering,
International Islamic University,
Islamabad, Pakistan
farhatabbas421@gmail.com
Jalal S. Alowibdi (e)
(e) Faculty of Computing and
Information Technology, University
of Jeddah, Saudi Arabia
jalowibdi@uj.edu.sa
ABSTRACT
Educational Data Mining (EDM) and Learning Analytics (LA)
have emerged as active areas of research that extract useful
knowledge from educational databases for many purposes, such as
predicting students’ success. The ability to predict a student’s
performance can guide actions in modern educational systems.
Existing methods have used
features which are mostly related to academic performance,
family income and family assets, while features related to
family expenditures and students’ personal information are
usually ignored. In this paper, an effort is made to investigate
these feature sets by collecting data on scholarship-holding
students from different universities in Pakistan.
Learning analytics and discriminative and generative classification
models are applied to predict whether a student will be able to
complete his or her degree. Experimental results show that
the proposed method significantly outperforms existing methods
by exploiting the family expenditure and students’ personal
information feature sets. Outcomes of this EDM/LA research can
serve as a policy improvement method in higher education.
CCS CONCEPTS
• Computing methodologies~Supervised learning by
classification • Applied computing~Education
KEYWORDS
Learning Analytics (LA); Educational Data Mining (EDM);
Student Performance Prediction; Family Expenditures; Students’
Personal Information
© 2017 International World Wide Web Conference Committee (IW3C2),
published under Creative Commons CC BY 4.0 License.
WWW 2017 Companion, April 3-7, 2017, Perth, Australia.
ACM 978-1-4503-4914-7/17/04.
DOI: http://dx.doi.org/10.1145/3041021.3054164
1. INTRODUCTION
Students are the main stakeholders of institutions/universities and
their performance plays a significant role in a country’s social and
economic growth by producing creative graduates, innovators and
entrepreneurs [26]. Educational data mining has emerged as a very
important area of research for revealing presentable and applicable
knowledge from large educational data repositories. Data mining
algorithms are used to obtain the hidden information and desired
benefits from these large data repositories [17]. There is a critical
need for academic institutions to maintain and integrate
large datasets of learners for multipurpose decision making. The
use of web technology has also become an integral part of the
current era of education in many universities, increasing the
amount of data about students, teachers and their interactions with
learning and educational systems [7,15].
Higher education plays an important role in the development of a
society. It is a field which provides a large amount of data about
participants such as students, teachers, facilities and curricula
[21]. The performance of students is a main concern of various
stakeholders including educators, administrators and corporations.
When recruiting fresh graduates, recruiting agencies consider
academic achievement the main factor. Therefore, graduates
have to work hard for excellent grades, so that they may live up to
the expectations of recruiting agencies [26].
The sources of educational data may be broadly divided into two
categories. The first category comprises centralized educational
systems such as learning management systems (LMS); ‘centralized’
here means that the educational data in the analytics come from one
source [4]. The second comprises decentralized educational data
retrieved from the resources of different systems, such as WWW data
and massive open online courses, which make use of many systems to
deliver the learning materials [5,18].
Recently, the analysis of educational data, under names such as
learning analytics, academic analytics, educational data mining,
predictive analytics and learners’ analytics, has emerged as an
innovative area of research [22]. What all of these terms have in
common is the use of educational data for multiple purposes. More
recently, a new term, ‘educational data science’, has been introduced
to clarify how different disciplines and researchers with different
research interests and backgrounds can work in this area [15].