Text Analysis Based Human Resource
Productivity Profiling
Basudev Pradhan (✉), Siddharth Swarup Rautaray, Amiya Ranjan Panda,
and Manjusha Pandey
School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, Odisha, India
{basudev.pradhan,siddharthfcs,amiya.pandafcs,manjushafcs}@kiit.ac.in
Abstract. Email is an efficient, cost-effective, real-time mode of communication
that supports the productivity of professionals in an organization. It accounts
for almost 90% of daily office procedures in organizations; hence, the
productivity of organizations depends heavily on the text communicated in emails.
The presented research work focuses on email profiling in organizations based on
the interpretation and analysis of mail text. The proposed work uses datasets
containing the email communication of ENRON Corporation as a test case.
Profiling will be done through text interpretation and analysis using machine
learning algorithms. A Bag of Words (BoW) model will be implemented to analyze
and predict the characteristics of incoming and outgoing emails, which can then
be mapped and profiled according to employee behavior into three categories:
productive, based on positive responses; neutral; and non-productive, based on
negative responses.
Keywords: Email profiling · text interpretation and analysis · ENRON dataset ·
machine learning · Bag of Words
1 Introduction
Professionals in an organization may be profiled by analyzing the text they use in
email communication while performing their roles in support of the organization's
goals, as almost 90% of communication in the current age of digital data
transformation is in email format [1]. The characteristics of the words used in email
communication indicate the behavior and attitude of a professional and reflect
how effectively that professional works. Much research and existing labor data indicate
that professionals who use positive terms in their email communication work effectively
and efficiently and tend to make the work environment more favorable for their co-workers.
The presented research work focuses on identifying and categorizing the choice of
words used by different professionals of the organization in their email communication.
Email profiling will be done based on mail text analytics using the Bag of Words (BoW)
approach. The input data, taken from the ENRON dataset, will be preprocessed
to remove email text formalities, and the information is integrated in a weighted
© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2024
Published by Springer Nature Switzerland AG 2024. All Rights Reserved
P. Pareek et al. (Eds.): IC4S 2023, LNICST 536, pp. 254–262, 2024.
https://doi.org/10.1007/978-3-031-48888-7_21