An Heuristic Feature Selection Algorithm to Evaluate Academic Performance of Students

Samuel-Soma M. Ajibade, Computer Science Department, UTM Johor Bahru, Malaysia, samuel.soma@yahoo.com
Nor Bahiah Ahmad, Computer Science Department, UTM Johor Bahru, Malaysia, bahiah@utm.my
Siti Mariyam Shamsuddin, Computer Science Department, UTM Johor Bahru, Malaysia, Mariyam@utm.my

Abstract—The value of schooling and the academic performance of students are the top priority of all academic institutions. Educational Data Mining (EDM) is an evolving area of research that helps academic institutions enhance their students' performance. Feature selection algorithms remove irrelevant and unrelated data from a dataset, thereby improving the performance of the classifiers used in EDM. The aim of this paper is to evaluate the performance of students using a heuristic technique, Differential Evolution (DE), as a feature selection algorithm on a student dataset. Several other feature selection algorithms that have not previously been applied to this dataset are also used. Classification techniques, namely Naïve Bayes (NB), Decision Tree (DT), K-Nearest Neighbor (KNN) and Discriminant Analysis (DISC), were used for the evaluation. The DE algorithm is proposed as a better feature selection algorithm for evaluating the academic performance of students, and it gave better accuracy than the other feature selection algorithms that were used. The results for the different feature selection algorithms and classification techniques will help researchers find the best combinations of classifiers and feature selection algorithms. This paper is a step towards enhancing the standard of education in academic institutions and towards guiding researchers in intervening strategically in academic issues.
Keywords—educational data mining, differential evolution, student performance, feature selection algorithm, classification algorithm

I. INTRODUCTION

Educational Data Mining (EDM) is a field of systematic inquiry concerned with developing techniques for discovering information in the particular kinds of data that come from educational settings, and with using those techniques to better understand learners and the environments in which they learn. It has recently emerged as an independent research field [1]. Feature selection is an active area of study within machine learning and data mining. It is performed to choose a subset of features by removing non-predictive information. As a result, the accuracy of performance prediction increases and the complexity of academic outcomes is reduced [2, 3, 4]. When feature selection techniques are used, the efficiency of the prediction model is enhanced. Feature selection has been applied successfully in numerous fields, for example text categorization, face recognition, cancer classification, gene classification and recommender systems. The search space consists of all possible subsets of features, so its size is 2^n, where n is the number of original features. Hence, finding the optimal feature subset is an NP-hard problem [5, 6]. In previous work, feature selection algorithms have been applied to improve the accuracy of classifiers that predict student performance, but heuristic algorithms are still rarely used, and the resulting accuracy has therefore been low. In this paper, DE is introduced as a better feature selection algorithm for evaluating the performance of students. DE is best known for reducing computation time and increasing the accuracy of classifiers.
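To make the size of the search space and the idea of a heuristic search concrete, the following is a minimal sketch of a DE-style search over feature subsets. It is an illustration under stated assumptions, not the implementation used in this paper: the DE/rand/1/bin scheme, the 0.5 threshold that turns a real-valued vector into a feature mask, the parameter values, and the names subset_count, toy_fitness and de_feature_selection are all assumptions introduced here. In practice the fitness would be the accuracy of a classifier trained on the selected features; a toy scoring function stands in for it.

```python
import random


def subset_count(n_features):
    """Number of candidate feature subsets for n features (2^n)."""
    return 2 ** n_features


# Hypothetical stand-in for classifier accuracy: features 0 and 3 are
# treated as the only informative ones, and extra features are penalized.
RELEVANT = {0, 3}


def toy_fitness(mask):
    hits = sum(1 for i in RELEVANT if mask[i])
    extras = sum(mask) - hits
    return hits - 0.1 * extras


def de_feature_selection(fitness, n_features, pop_size=20, generations=50,
                         F=0.5, CR=0.9, seed=0):
    """DE/rand/1/bin over vectors in [0, 1]^n; a value > 0.5 selects a feature."""
    rng = random.Random(seed)

    def decode(vec):
        return [v > 0.5 for v in vec]

    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    scores = [fitness(decode(ind)) for ind in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_features)  # forces at least one mutated gene
            trial = []
            for j in range(n_features):
                # Binomial crossover between the mutant and the current vector.
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    v = min(1.0, max(0.0, v))  # keep the value inside [0, 1]
                else:
                    v = pop[i][j]
                trial.append(v)
            # Selection: the trial replaces the parent only if it is no worse.
            trial_score = fitness(decode(trial))
            if trial_score >= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = max(range(pop_size), key=scores.__getitem__)
    return decode(pop[best]), scores[best]


mask, score = de_feature_selection(toy_fitness, n_features=8)
```

With 8 features there are already subset_count(8) = 256 candidate subsets, and the count doubles with every added feature, which is why exhaustive enumeration quickly becomes impractical and a population-based search is attractive. The sketch evaluates at most pop_size × (generations + 1) subsets regardless of n.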
This work attempts to identify the best combinations of feature selection techniques and classification methods on the student dataset. The low performance reported previously has been attributed to the inadequate use of variables and to the use of a single base classifier, and the use of heuristic algorithms in performance prediction has therefore been recommended [11]. Section II reviews related literature, and Section III describes the methodology. Section IV discusses the results, and Section V concludes the work.

II. LITERATURE REVIEW

Differential Evolution (DE) is a population-based algorithm that resembles the Genetic Algorithm (GA) in that it uses the operators crossover, mutation and selection. The main difference between DE and GA lies in how better solutions are constructed: DE relies on the mutation operation, whereas GA relies on the crossover operation. DE was introduced by Storn and Price in 1997 [7]. It is a real-valued optimizer, and applying its operators directly to feature indices can cause the same feature to appear more than once in the solution vector. A hybrid approach that combines Artificial Bee Colony (ABC) optimization and DE is proposed in [8] and DE

2019 IEEE 10th Control and System Graduate Research Colloquium (ICSGRC 2019), 2 - 3 August 2019, Shah Alam, Malaysia
978-1-7281-0755-4/19/$31.00 ©2019 IEEE