Text Analysis Based Human Resource Productivity Profiling

Basudev Pradhan, Siddharth Swarup Rautaray, Amiya Ranjan Panda, and Manjusha Pandey

School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, Odisha, India
{basudev.pradhan,siddharthfcs,amiya.pandafcs,manjushafcs}@kiit.ac.in

Abstract. Email is an efficient, cost-effective, real-time mode of communication that supports the productivity of professionals in an organization. It constitutes almost 90% of daily office procedures in organizations, so the productivity of an organization depends heavily on the text communicated in emails. The presented research work focuses on email profiling in organizations based on the interpretation and analysis of mail text. The proposed work will use datasets containing the email communication of ENRON Corporation as a test case. The profiling would be done with a text interpretation and analysis algorithm built on machine learning. A Bag of Words (BoW) model will be implemented to analyze and predict the characteristics of incoming and outgoing emails; these could then be mapped and profiled, according to the behavior of employees, into three categories: productive (based on positive responses), neutral, and non-productive (based on negative responses).

Keywords: Email profiling · text interpretation and analysis · ENRON dataset · machine learning · Bag of Words

1 Introduction

Professionals in an organization may be profiled based on an analysis of the text they use in email communication while performing their roles in support of the organization's goals, as almost 90% of communication in the current age of digital data transformation is in email format [1]. The characteristics of the words used in email communication indicate the behavior and attitude of a professional and reflect how effectively that professional works.
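As a rough illustration of the three-way profiling described in the abstract, the following minimal sketch counts words in an email and compares the number of positive and negative terms. It is not the authors' implementation: the word lists `POSITIVE_TERMS` and `NEGATIVE_TERMS`, the function names, and the simple majority rule are all assumptions made for this example, and a trained machine learning classifier over BoW features would replace the hand-written rule in practice.

```python
import re
from collections import Counter

# Hypothetical word lists, chosen only for illustration; not from the paper.
POSITIVE_TERMS = {"thanks", "great", "agree", "pleased", "appreciate"}
NEGATIVE_TERMS = {"delay", "problem", "unacceptable", "complaint", "fail"}


def bag_of_words(text: str) -> Counter:
    """Lowercase the text, extract alphabetic tokens, and count each token."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def profile_email(text: str) -> str:
    """Assign one of three labels by comparing positive and negative term counts.

    The majority rule below is an assumed stand-in for a learned classifier.
    """
    bow = bag_of_words(text)
    positive = sum(bow[term] for term in POSITIVE_TERMS)
    negative = sum(bow[term] for term in NEGATIVE_TERMS)
    if positive > negative:
        return "productive"
    if negative > positive:
        return "non-productive"
    return "neutral"
```

For instance, calling `profile_email` on a message that contains more terms from the positive list than from the negative list yields the label `"productive"`, while a message containing none of the listed terms is labeled `"neutral"`.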
© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2024. Published by Springer Nature Switzerland AG 2024. All Rights Reserved. P. Pareek et al. (Eds.): IC4S 2023, LNICST 536, pp. 254–262, 2024. https://doi.org/10.1007/978-3-031-48888-7_21

Much research and existing labor data indicate that professionals who use positive terms in their email communication work effectively and efficiently and tend to make the work environment more favorable for their co-workers. The presented research work focuses on identifying and categorizing the choice of words used by different professionals of the organization in their email communication. The email profiling would be done based on mail text analytics using the Bag of Words (BoW) machine learning algorithm. The input data taken from the ENRON dataset will be preprocessed to remove email text formalities, and the information is integrated in a weighted