International Journal of Data Science and Artificial Intelligence (IJDSAI)
Volume 02, Issue 03, May – June (2024)
ISSN: 2584-1041
©KITS PRESS Publications

RESEARCH ARTICLE

SEMANTIC FEATURE ENABLED AGGLOMERATIVE CLUSTERING FOR INFORMATION TECHNOLOGY JOB PROFILE ANALYSIS

B. Jaison 1,*, R. Gladis Kiruba 2 and G. Belshia Jebamalar 3

1 Department of Computer Science and Engineering, R.M.K. Engineering College, Kavaraipettai-601206 India.
2 Department of Electronics and Communication Engineering and AIML, Bangalore College of Engineering and Technology, Chandapura, Bengaluru, India.
3 Department of Computer Science Engineering, S.A Engineering College, Thiruverkadu, Tamil Nadu 600077 India.
* Corresponding e-mail: bjn.cse@rmkec.ac.in

Abstract – The implementation and maintenance of computer systems are the core activities of information technology, which also covers database administration and network architecture. IT professionals work in an environment that supports the setup of internal networks and the development of computer systems. There is an immediate need for a suitable approach to close the gap between the supply of and demand for IT workers. Extensive research into IT job profiles is crucial to meeting industry demands, and educational programs must identify the abilities that the industry requires to modernize its manufacturing. Semantic Feature-Enabled Agglomerative Clustering for Information Technology Job Profiling (SEA-IT) is proposed to address these challenges. Semantic analysis is performed using a tree-like strategy. The most frequently used phrases and words from each cluster of IT professions are collected to characterize the specific knowledge each cluster requires. First, data from online job posting sources are collected and pre-processed using stemming, normalization, text correction, stop-word removal, and tokenization. Second, features are extracted from the pre-processed data using a bag-of-words model.
After feature extraction, clusters are generated using an agglomerative algorithm to produce the IT job analysis result, so that the knowledge and capabilities of IT professionals can be upgraded. The simulation findings, based on evaluation criteria and other statistical tests, demonstrated the effectiveness of the suggested algorithm. Experiments showed that SEA-IT works well with a variety of descriptive methodologies and is independent of the dataset's dimensions.

Keywords – Information Technology, Preprocessing, bag of words, agglomerative algorithm.

1. INTRODUCTION

Informatics engineering is a subject that teaches students the principles of computer science and mathematical analysis in order to create, build, test, and evaluate software [1]. Careers such as database administrator, data scientist, UI designer, IT project manager, network engineer, UX designer, and other professions in the field of informatics technology are all open to graduates of the Department of Informatics Engineering with a computer science degree [2]. This differs from other majors, where a student with a medical degree will work as a doctor, a student with a nursing degree will work as a nurse, a student with a pharmacy degree will work as a pharmacist, a student with an education degree will work as a teacher, and so on [3]. For informatics engineering students, however, a good education alone is not sufficient unless it is complemented by additional skills [4]. Students majoring in informatics engineering must broaden their skill sets in order to meet the requirements for expert workers in specific IT domains [5]. Given the wide variety of IT job fields, career seekers must be aware of the specialized knowledge and skills that each field requires [6].
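
To make the pipeline summarized in the abstract concrete (pre-processing, bag-of-words features, agglomerative clustering, and most-frequent terms per cluster), the following is a minimal sketch in plain Python. It is not the authors' implementation: the job postings, the stop-word list, the function names, and the choice of cosine distance with average linkage are all illustrative assumptions, and stemming and text correction are omitted for brevity.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word list used only for this illustration.
STOP_WORDS = {"a", "an", "and", "the", "of", "for", "with", "in", "to", "on"}


def preprocess(text):
    """Normalize to lowercase, tokenize, and remove stop words."""
    tokens = re.findall(r"[a-z0-9+#]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(docs):
    """Return a sorted vocabulary and one term-count vector per document."""
    counts = [Counter(preprocess(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    return vocab, [[c[w] for w in vocab] for c in counts]


def cosine_distance(u, v):
    """1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def agglomerative(vectors, n_clusters):
    """Bottom-up clustering with average linkage (an assumed criterion).

    Starts with one cluster per document and repeatedly merges the two
    closest clusters until n_clusters remain. Returns lists of indices.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters


def top_terms(docs, cluster, k=3):
    """Most frequent pre-processed terms among the documents of one cluster."""
    total = Counter()
    for i in cluster:
        total.update(preprocess(docs[i]))
    return [w for w, _ in total.most_common(k)]


# Made-up job postings, for illustration only.
postings = [
    "Database administrator with SQL and backup experience",
    "SQL database administrator for backup and recovery",
    "UI designer with Figma and prototyping skills",
    "UX designer skilled in Figma prototyping",
]
vocab, vectors = bag_of_words(postings)
clusters = agglomerative(vectors, n_clusters=2)
for c in clusters:
    print(c, top_terms(postings, c))
```

On this toy input the two database postings and the two designer postings end up in separate clusters, and the frequent terms of each cluster hint at the skills associated with it. A real corpus would call for a proper stemmer, a fuller stop-word list, and a library implementation of hierarchical clustering.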
The scarcity of work-ready graduates in informatics technology fields arises because informatics engineering students frequently do not know how their abilities measure up against the skill requirements of the IT industry [7]. An application that recommends careers in IT sectors based on a candidate's IT knowledge and abilities is therefore essential [8]. Although all students who complete the Informatics Engineering program hold the same qualification, each has unique knowledge and talents [9]. The model is built on a new dataset of the terms and phrases most frequently used in the information technology sector to describe skills and knowledge. Ten IT professionals from various commercial and government enterprises in Indonesia confirmed the skills needed for each position through focus group discussions (FGD) [10].

The main contribution of the suggested method is as follows:
• Initially, the data from the online job posting sources will be collected and pre-processed using