An Heuristic Feature Selection Algorithm to
Evaluate Academic Performance of Students
Samuel-Soma M. Ajibade
Computer Science Department
UTM
Johor Bahru, Malaysia
samuel.soma@yahoo.com
Nor Bahiah Ahmad
Computer Science Department
UTM
Johor Bahru, Malaysia
bahiah@utm.my
Siti Mariyam Shamsuddin
Computer Science Department
UTM
Johor Bahru, Malaysia
Mariyam@utm.my
Abstract—The quality of schooling and the academic performance
of students are the topmost priority of all academic institutions.
Educational Data Mining (EDM) is an evolving area of research
that helps academic institutions enhance their students'
performance. Feature selection algorithms remove unsuitable and
irrelevant data from the dataset, thereby improving the
performance of the classifiers used in EDM. The aim of this paper
is to evaluate the performance of students using a heuristic
technique known as Differential Evolution (DE) for feature
selection on a student dataset. Several other feature selection
algorithms that have never been used before on the dataset are
also applied. Classification techniques, namely
Naïve Bayes (NB), Decision Tree (DT), K-Nearest Neighbor
(KNN) and Discriminant Analysis (DISC), were used for evaluation.
The DE algorithm is proposed as a better
feature selection algorithm for evaluating the academic
performance of students, and it gave better accuracy than
the other feature selection algorithms that were used. The outcome
of the different feature selection algorithms and classification
techniques will help researchers find the best combinations of
classifiers and feature selection algorithms. This paper is a
step towards enhancing the
standard of education in academic institutions and towards
guiding researchers in intervening strategically in
academic issues.
Keywords—educational data mining, differential evolution,
student performance, feature selection algorithm, classification
algorithm
I. INTRODUCTION
Educational Data Mining (EDM) is a field of systematic inquiry
concerned with developing techniques for discovering
knowledge in the particular kinds of data that come from
educational settings, and with using those techniques to better
understand learners and the settings in which they learn. It
has recently emerged as an independent research field [1].
Feature selection is an active area of study within
machine learning and data mining. Feature selection is
performed to choose a subset of features by removing
non-predictive information. As a result, the accuracy of
performance prediction increases and the complexity of
academic outcomes is reduced [2, 3, 4]. When feature selection
techniques are used, the efficiency of the prediction model is
enhanced.
Feature selection has been successfully applied in
numerous fields, for example text categorization, face
recognition, cancer classification, gene classification and
recommender systems. The search space consists of all the
possible subsets of features, which means that the size of the
search space is 2^n, where n is the number of original
features. Hence, the problem of finding the optimum feature
subset is NP-hard [5,6]. In previous works, feature
selection algorithms have been applied to improve the accuracy
of student performance classifiers, but the use of heuristic
algorithms is still relatively low, and hence the accuracy
has been low. In this paper, Differential Evolution (DE) is
introduced as a better feature selection algorithm for
evaluating the performance of students. DE is best known
for reducing computation time and increasing the accuracy of
classifiers. This work attempts to identify the best
combinations of feature selection techniques and classification
methods on the student dataset. Also, the low
performance has been attributed to the inadequate use of
variables as well as the use of a single base classifier, and as
such the use of heuristic algorithms in performance prediction
has been recommended [11]. Section II below reviews the related
literature, while Section III describes the
methodology used. In Section IV, the results are discussed, and
Section V draws the conclusion of this work.
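To make the size of the feature subset search space concrete, the following minimal Python sketch enumerates every subset of a small feature list and counts the subsets for a larger n. The feature names and the helper functions `count_feature_subsets` and `enumerate_subsets` are hypothetical and are used only for illustration; they are not taken from the dataset used in this paper.

```python
from itertools import combinations


def count_feature_subsets(n):
    """Return the number of possible feature subsets (empty set included)."""
    return 2 ** n


def enumerate_subsets(features):
    """Yield every subset of `features` as a tuple, smallest subsets first."""
    for k in range(len(features) + 1):
        yield from combinations(features, k)


# Hypothetical feature names, for illustration only.
features = ["attendance", "quiz_score", "forum_posts"]
subsets = list(enumerate_subsets(features))

print(len(subsets))               # 8, i.e. 2**3
print(count_feature_subsets(16))  # 65536
```

Exhaustive enumeration is only practical for very small n, because the count doubles with every added feature. This is the motivation for heuristic search methods.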
II. LITERATURE REVIEW
Differential Evolution (DE) is a population-based algorithm
that can be seen as similar to the Genetic Algorithm (GA), since
it uses operators such as crossover, mutation and selection. The
main difference between DE and GA lies in how better solutions
are constructed: DE relies on the mutation operation, whereas GA
relies on the crossover operation. DE was introduced by
Storn and Price in 1997 [7]. It is a real-number
optimizer, and applying the DE operators to the indices
of the features can cause the same features to appear
multiple times in the solution vector.
A hybrid approach that combines Artificial Bee Colony (ABC)
optimization and DE is suggested in [8], and DE
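The three operators named above can be sketched as follows. This is a minimal illustration of one common DE variant (often written DE/rand/1/bin) applied to a continuous test function, not the feature selection procedure used in this paper. The parameter names `F` (mutation scale), `CR` (crossover rate) and `pop_size`, their default values, and the `sphere` objective are all assumptions chosen for the example.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, F=0.8, CR=0.9,
                           generations=100, seed=0):
    """Minimize `objective` over box `bounds` with a basic DE/rand/1/bin loop."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [objective(x) for x in pop]

    for _ in range(generations):
        new_pop, new_fit = list(pop), list(fitness)
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            mutant = [pop[a][d] + F * (pop[b][d] - pop[c][d])
                      for d in range(dim)]
            # Clip the mutant back into the search bounds.
            mutant = [min(max(v, lo), hi)
                      for v, (lo, hi) in zip(mutant, bounds)]
            # Binomial crossover: take at least one component from the mutant.
            j_rand = rng.randrange(dim)
            trial = [mutant[d] if (rng.random() < CR or d == j_rand)
                     else pop[i][d] for d in range(dim)]
            # Selection: keep the trial only if it is no worse than the parent.
            f_trial = objective(trial)
            if f_trial <= fitness[i]:
                new_pop[i], new_fit[i] = trial, f_trial
        pop, fitness = new_pop, new_fit

    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best]


def sphere(x):
    """Simple test objective (assumed for illustration): sum of squares."""
    return sum(v * v for v in x)


best_x, best_f = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
print(best_x, best_f)
```

Because this sketch searches over real-valued vectors, using it for feature selection would require an additional mapping from each vector to a feature subset, for example by thresholding or by treating components as feature indices. The second choice is where repeated features in the solution vector can arise.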
2019 IEEE 10th Control and System Graduate Research Colloquium (ICSGRC 2019), 2 - 3 August 2019, Shah Alam, Malaysia
978-1-7281-0755-4/19/$31.00 ©2019 IEEE