Big Data in Academia: A Proposed Framework for Improving Students Performance

Imran Rashid Banday 1, Majid Zaman 2*, Syed Mohammad Khurshid Quadri 3, Sheikh Amir Fayaz 1, Muheet Ahmed Butt 1

1 Department of Computer Sciences, University of Kashmir, J&K 190006, India
2 Directorate of IT&SS, University of Kashmir, J&K 190006, India
3 Department of Computer Science, Jamia Millia Islamia, New Delhi 110025, India

Corresponding Author Email: zamanmajid@gmail.com
https://doi.org/10.18280/ria.360411

Received: 16 April 2022
Accepted: 2 August 2022

ABSTRACT

Information technology has radically changed the way people learn. Fragmented learning, an informal method of learning, has become a popular way to acquire new technology and expertise. Academic organizations generate a large amount of heterogeneous data, and academic leaders want to make the most of it by analyzing this data in order to make better decisions. Volume is not the only issue; the variety of the organization's data (structured, semi-structured, and unstructured) adds to the complexity of day-to-day academic work and decision-making. As big data has become more prevalent in educational settings, new data-driven techniques for supporting informed decision-making and improving educational efficacy have emerged. Digital traces of student behaviour, which were previously too expensive to obtain with traditional data sources and approaches, offer a more scalable and finer-grained understanding of, and support for, learning processes. This study provides a fragmented learning solution for students in a data environment that can suggest subjects to them based on their geographical location, gender, and district of residence, among other factors. The suggested framework is expected to play a key role in directing the development of a society that values lifelong learning.
Keywords: big data, education, subject recommendation, heterogeneous sources, information technology, Kashmir university

1. INTRODUCTION

In recent years, the IT sector has seen tremendous growth in the volume of data created, owing mostly to Internet services, prompting a rethinking of what the word "database" means. Big Data is a new term for the description and management of massive amounts of structured and unstructured data produced by businesses, organizations, and social media settings [1]. Educational institutions such as universities operate in an increasingly complicated and competitive environment. They must compete with other institutions while responding to national and global economic, political, and social developments. Furthermore, various stakeholders want universities to develop appropriate solutions to these needs in a timely way. To address this issue, universities must make appropriate decisions for coping with these rapid changes by examining the large data sources that have been generated [2, 3]. Most universities spend a significant amount of money on information technology in order to construct a data warehouse system. The rapid pace of technological advancement, together with the emergence of new web-based learning modes, has driven a revolution in information and communication technology. This has increased competition among higher education institutions, and new technological solutions to improve the quality of higher education have emerged from various IT solution providers such as SAP, Cisco, Microsoft, and others, creating a parallel universe of IT offerings [4, 5].
Higher education institutions, such as universities, must deal with massive amounts of data from various sources for accreditation purposes, including data generated from online transactions, videos, audios, images, emails, social media, click streams, logs, posts, search queries, social networking interactions, science data, mobile phone applications, and data stored on multiple operating systems [6]. To meet accreditation criteria, this data is normally kept, sorted, retrieved, and analyzed in a traditional format. Traditional database approaches and tools, however, are incapable of efficiently processing large amounts of data [7]. The growing number of data repositories in universities necessitates the management of internal data. The data's heterogeneity stems from the various types of data involved, which span different streams, courses, and structures. The University of Kashmir has been collecting data and working on data automation and integration since 2002, resulting in a wide range of data repositories and hence a need for Big Data analytics. The heterogeneity, redundancy, and inconsistency in the current set of data are the key reasons for implementing the big data idea at the University of Kashmir [8, 9].

This paper is structured as follows. This section provides a brief overview of big data. Section 2 presents a generalized review of the literature. Section 3 defines the levels of the University of Kashmir education system in which academic data is analyzed and presents the various course structures, while section 4 provides a manual approach to experimental analysis. Section 5 elaborates on the issues that arise when using Big Data in the university system and their resolutions, and the final section concludes the paper.

Revue d'Intelligence Artificielle Vol. 36, No. 4, August, 2022, pp. 589-595 Journal homepage: http://iieta.org/journals/ria