International Journal of Data Science and Artificial Intelligence (IJDSAI)
Volume 02, Issue 03, May – June (2024)
ISSN: 2584-1041 ©KITS PRESS Publications
RESEARCH ARTICLE
SEMANTIC FEATURE ENABLED AGGLOMERATIVE CLUSTERING FOR INFORMATION TECHNOLOGY JOB PROFILE ANALYSIS
B. Jaison 1,*, R. Gladis Kiruba 2 and G Belshia Jebamalar 3
1 Department of Computer Science and Engineering, R.M.K. Engineering College, Kavaraipettai-601206 India.
2 Department of Electronics and Communication Engineering and AIML, Bangalore College of Engineering and Technology, Chandapura, Bengaluru, India.
3 Department of Computer Science Engineering, S.A Engineering College, Thiruverkadu, Tamil Nadu 600077 India.
* Corresponding e-mail: bjn.cse@rmkec.ac.in
Abstract – The maintenance and implementation of computer
systems are the core activities of information technology.
Database administration and network architecture are also
included in information technology. Professionals have access
to a working environment that facilitates the setup of internal
networks and the development of computer systems. There is an
immediate need for a suitable approach to close the gap between
supply and demand for IT workers. Extensive research into IT
job profiles is crucial to meeting industry demands. Educational
programs must identify the abilities that the industry requires
to modernize its manufacturing. Semantic Feature-Enabled
Agglomerative Clustering for Information Technology Job
Profiling (SEA-IT) has been proposed to overcome these
challenges. Semantic analysis is performed using a tree-like
strategy. The most frequently used phrases and words from
each cluster of IT professions were collected to demonstrate
specific knowledge. First, data from online job posting
sources are collected and pre-processed using techniques such
as stemming, normalization, text correction, stop-word removal,
and tokenization. Second, features are extracted from the
pre-processed data using a bag-of-words model. After feature
extraction, clusters are generated using an agglomerative
algorithm to form an IT job analysis result, so that the
knowledge and capabilities of IT professionals can be
upgraded. The simulation findings, based on evaluation criteria
and other statistical tests, support the suggested algorithm.
Experiments demonstrated that SEA-IT functions well with a
variety of descriptive methodologies and is independent of the
dataset's dimensions.
Keywords – Information Technology, Preprocessing, bag of words,
agglomerative algorithm.
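The pipeline summarized in the abstract (pre-processing, bag-of-words features, agglomerative clustering, then frequent terms per cluster) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy postings, the stop-word list, the cosine distance, and the average-linkage rule are all assumptions made for illustration, and stemming and text correction are omitted for brevity.

```python
import math
import re
from collections import Counter

# Hypothetical toy postings; the paper's dataset is not reproduced here.
POSTINGS = [
    "Database administrator with SQL and backup experience",
    "Senior database administrator: SQL tuning, backup, recovery",
    "Network engineer for routing, switching and firewall setup",
    "Network engineer with firewall and routing knowledge",
]

# Assumed, deliberately tiny stop-word list.
STOP_WORDS = {"with", "and", "for", "the", "of", "a", "an"}


def preprocess(text):
    """Lowercase (normalization), tokenize on letters, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(docs):
    """Return the sorted vocabulary and one term-count vector per document."""
    vocab = sorted({t for doc in docs for t in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def cosine_distance(u, v):
    """1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def agglomerative(vectors, n_clusters):
    """Average-linkage (assumed) bottom-up clustering; returns index lists."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest average distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dists = [cosine_distance(vectors[a], vectors[b])
                         for a in clusters[i] for b in clusters[j]]
                avg = sum(dists) / len(dists)
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


def top_terms(cluster, docs, k=3):
    """Most frequent tokens across the documents of one cluster."""
    counts = Counter(t for idx in cluster for t in docs[idx])
    return [term for term, _ in counts.most_common(k)]


docs = [preprocess(p) for p in POSTINGS]
vocab, vectors = bag_of_words(docs)
clusters = agglomerative(vectors, n_clusters=2)
for cluster in clusters:
    print(sorted(cluster), top_terms(cluster, docs))
```

On this toy input the two database postings and the two network postings end up in separate clusters, and the frequent terms of each cluster describe the corresponding job profile. A production pipeline would more likely rely on an established library for stemming and clustering.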
1. INTRODUCTION
Informatics engineering is a subject that teaches students
the principles of computer science and mathematical analysis
in order to create, build, test, and evaluate software [1].
Database administrators, data scientists, UI designers, IT
project managers, network engineers, UX designers, and
other professions in the field of informatics technology are
all open to graduates of the Department of Informatics
Engineering with a computer science degree [2]. This differs
from other majors, where a student with a medical degree
will work as a doctor, a student with a nursing degree will
work as a nurse, a student with a pharmacy degree will work
as a pharmacist, a student with an education degree will work
as a teacher, and so on [3].
However, a good education alone is not sufficient for
informatics engineering students without complementary
skills [4]. Students majoring in informatics engineering must
broaden their skill sets in order to meet the requirements for
expert workers in certain IT domains [5]. Job seekers must be
aware of the specialized knowledge and skills required in each
of the many IT career fields [6]. The scarcity of graduates
prepared for the workplace in informatics technology areas
arises because informatics engineering students frequently do
not know how their abilities measure up against the skill
requirements of IT industries [7].
An application system that suggests careers in IT sectors
based on IT knowledge and abilities is therefore essential
[8]. Even though all students who complete the Informatics
Engineering program receive the same qualification, each has
unique knowledge and talents [9]. The terms and phrases most
frequently used in the information technology sector to
represent skills and knowledge, compiled as a new dataset,
serve as the foundation for the model. Ten IT
professionals from various commercial and government
enterprises in Indonesia confirmed the skills needed for each
position through focus group discussions (FGD) [10]. The
main contributions of the suggested method are as follows:
• Initially, the data from the online job posting
sources will be collected and pre-processed using