ANALYSIS OF STUDENT ACADEMIC PERFORMANCE USING CLUSTERING TECHNIQUES

K. Govindasamy 1, T. Velmurugan 2
1 Research Scholar, VELS University, Chennai, India.
2 Associate Professor, PG and Research Department of Computer Science, D. G. Vaishnav College, Chennai, India.
E-Mail: 1 mphilgovind@gmail.com, 2 velmurugan_dgvc@yahoo.co.in

Abstract: Student performance is an essential concern in higher learning institutions. Predicting student performance becomes more challenging as the volume of data in educational databases grows. Clustering is one of the data mining methods used to analyze massive volumes of data. It categorizes data into clusters such that objects are grouped in the same cluster when they are similar according to specific metrics. This paper studies and compares four clustering algorithms: k-Means, k-Medoids, Fuzzy C Means (FCM) and Expectation Maximization (EM). The main advantage of clustering is that interesting patterns and structures can be found directly from very large data sets with little or no background knowledge. The performance of the clustering algorithms is compared on three factors: purity, normalized mutual information (NMI) and the time taken to form clusters.

Keywords: Educational Data Mining, k-Means Algorithm, k-Medoids Algorithm, Fuzzy C Means Algorithm, Expectation Maximization Algorithm.

1. Introduction

Data mining is the process of extracting previously unknown, valid, potentially useful and hidden patterns from large data sets. The amount of data stored in educational databases is increasing rapidly. To benefit from such large data and to find hidden relationships between variables, different data mining techniques have been developed and used. Clustering is one of the most widely used techniques in data mining. In this setting, the aim of clustering is to partition students into homogeneous groups according to their characteristics and abilities [1].
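Two of the comparison factors named in the abstract, purity and NMI, can be illustrated with a short sketch. The code below is only a minimal illustration, not the authors' implementation: the function names are hypothetical, the grade labels and cluster assignments are made-up toy data, and NMI is normalised here by the arithmetic mean of the two entropies, which is one common convention and an assumption, since the normalisation used in the paper is not stated in this section.

```python
import math
from collections import Counter


def purity(true_labels, cluster_labels):
    """Fraction of points that belong to the majority class of their cluster."""
    per_cluster = {}
    for t, c in zip(true_labels, cluster_labels):
        per_cluster.setdefault(c, Counter())[t] += 1
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(true_labels)


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((k / n) * math.log(k / n) for k in Counter(labels).values())


def nmi(true_labels, cluster_labels):
    """Mutual information divided by the mean of the two entropies (assumed convention)."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, cluster_labels))
    t_counts = Counter(true_labels)
    c_counts = Counter(cluster_labels)
    mi = sum(
        (n_tc / n) * math.log(n * n_tc / (t_counts[t] * c_counts[c]))
        for (t, c), n_tc in joint.items()
    )
    denom = (entropy(true_labels) + entropy(cluster_labels)) / 2
    # Both partitions trivial (one group each): treat as perfect agreement.
    return mi / denom if denom > 0 else 1.0


# Toy data (hypothetical): known performance bands and cluster ids for six students.
grades = ["high", "high", "medium", "medium", "low", "low"]
clusters = [0, 0, 1, 1, 1, 2]

print("purity:", purity(grades, clusters))
print("NMI:", nmi(grades, clusters))
```

Both measures need reference class labels in addition to the cluster assignments. Purity rewards clusters dominated by a single class, while NMI also penalises splitting one class across many clusters.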
Educational organizations usually collect huge amounts of data relevant to faculty members, students and so on, but the importance of the collected data is often unknown. Data that serve only simple queries or traditional reports may be insignificant and contribute little to inference or decision making in the organization, and the collected data may well include such insignificant records. The volume and complexity of the collected data may also be so high that the data are not easy to handle. In that case the data may go unused and occupy storage unnecessarily. The available data become usable only when they are converted into useful information by exploiting their potential. A wide range of data mining algorithms is used to extract useful information from the data gathered in various educational organizations.

International Journal of Pure and Applied Mathematics, Volume 119, No. 15, 2018, 309-323. ISSN: 1314-3395 (on-line version). url: http://www.acadpubl.eu/hub/ Special Issue