(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 9, No. 11, 2018

A Novel Student Risk Identification Model using Machine Learning Approach

Nityashree Nadar (1), Bharathiar University, Coimbatore
Dr. R. Kamatchi (2), Amity School of Engineering and Technology, Panvel, Mumbai

Abstract: This research aims to address the problem of detecting students who are at risk of failing to complete their course. The conceptual design presents a solution for efficient learning in the absence of data from previous courses, which is generally used for training state-of-the-art machine learning (ML) models. This situation typically arises when a university introduces new courses. To address it, this work builds a novel learning model that trains an ML model on data constructed from the present course. The proposed model uses data about already submitted tasks, which in turn introduces the issue of imbalanced data in both training and testing of the classification model. The contributions of the proposed model are: the design of training the learning model to detect at-risk students using information from the present course; tackling the challenge of imbalanced data, which is present in both training and testing data; defining the problem as a classification task; and, lastly, developing a novel non-linear support vector machine (NL-SVM) classification model. Experimental results show that the proposed model attains significant results when compared with state-of-the-art models.

Keywords: Classification; imbalanced data; machine learning; virtual learning environment

I. INTRODUCTION

Student dropout is an important problem across various levels, such as primary school, higher secondary, and graduation level, and the situation is much worse in Massive Open Online Courses (MOOCs).
According to the research in [1], [2], about 20% of students in the USA do not complete graduation, and in Europe around 20% to 50% fail to finish their studies on time [3]. For online or distance education, these statistics are even worse, with 78% of students not completing graduation [4]. The situation is worse still for students registered in MOOCs: the percentage of students who enrolled and successfully finished the course is only 5% as reported in [5], or 15% as reported in [6]. The problem of identifying students who are expected to fail a course has been extensively analyzed across various research communities in recent times [7], [8], [9]. It was also a major subject of the KDD Cup 2015 competition, which mainly aimed at forecasting student withdrawal from online courses.

Identifying students who are at risk of withdrawing from or failing their course is the initial step towards providing them with remedial (material) support. Generally, supportive measures are carried out by the instructor or professor, who receives the outcome of the forecast [7], [8]. Alternatively, the forecasting model may generate email messages that communicate directly with the student [10]. The primary objective is to enhance student learning, keep students engaged in the course, and help them complete their research or study programs.

In distance or online courses, most material is delivered through a Virtual Learning Environment (VLE). In a VLE, each and every action is recorded and stored. In addition, student information such as assessments, task results, and demographic information is also kept. These data are cleansed, and ML is applied to build a forecasting (predictive) model. Online course providers then use these models to forecast which students are at risk of not completing the course on time.
A generic way of building a predictive model is to train it on legacy data from previous presentations of the course, such as information on previously submitted tasks [8]. The trained model is then applied to the present presentation. However, this approach is not effective for new courses that have no history. For such cases, it is important to find a new solution. Extensive surveys of MOOCs [11] and Higher Education (HE) courses [8] show that the highest dropout occurs during first-year courses, and that many students drop out within a month, in the first few weeks of the course presentation. Fee payment towards courses may also be a cause. Therefore, the objective is to identify students who are at risk of dropping out or failing to complete on time as early as possible. It must also be noted that the same pattern may not hold across different universities, educational institutions, or course designs; rapid student dropout may also arise at a late stage of a course [9].

To overcome these research challenges, this work aims at designing a forecasting model that identifies students at risk of failing or of not completing on time by presenting a novel non-linear support vector machine (NLSVM) classification model. The contributions of this work are as follows:

• Presenting a non-linear enhanced support vector machine classification model for identifying students at risk of failure.
• The NLSVM can be used both as a binary classifier and as a multi-level classifier.
• Our model attains good accuracy when compared with state-of-the-art models.
• Experimental results show good performance in terms of ROC, F-measure, precision, and recall.
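
The precision, recall, and F-measure named above can be illustrated with a minimal Python sketch. It is not the paper's evaluation code: the function names and the toy labels (1 = at-risk student, 0 = not at risk) are hypothetical and chosen only for illustration. The sketch shows why accuracy alone can be misleading on imbalanced data, which is the setting this work targets.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive and t != positive:
            fp += 1
        elif p != positive and t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1); each is 0.0 when its denominator is 0."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 10 students, only 2 of them at risk (imbalanced).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A trivial classifier that never flags anyone as at risk.
always_negative = [0] * 10

# A classifier that finds one at-risk student and raises one false alarm.
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

print(accuracy(y_true, always_negative))             # 0.8
print(precision_recall_f1(y_true, always_negative))  # (0.0, 0.0, 0.0)
print(accuracy(y_true, y_pred))                      # 0.8
print(precision_recall_f1(y_true, y_pred))           # (0.5, 0.5, 0.5)
```

Both toy classifiers reach the same accuracy, yet only the second one detects any at-risk student. This is one reason to report precision, recall, and F-measure alongside accuracy when the classes are imbalanced.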