Received: 23 July 2021 | Revised: 26 December 2021 | Accepted: 27 January 2022
DOI: 10.1002/cpe.6911

RESEARCH ARTICLE

A novel feature selection with stochastic gradient descent logistic regression for multilabeled stress prediction in working employees

Swaminathan Anitha, Muthuraman Vanitha

Department of Computer Applications, Alagappa University, Karaikudi, India

Correspondence: Swaminathan Anitha, Department of Computer Applications, Alagappa University, Karaikudi, India. Email: nathan.anitha@gmail.com

Abstract

In recent times, stress prediction has become an active research area, and several approaches have been developed to address it. Machine learning (ML) models assist the stress prediction process by identifying patterns effectively and offering useful insights for possible future intervention. In this view, this article presents a multi-labeled stress prediction model for working employees that uses extremely randomized tree (ET) based feature selection (FS) and stochastic gradient descent (SGD) with logistic regression (LR), called the ETSGD-LR model. First, the ET-based FS technique computes impurity-based feature importance, which in turn is used to discard irrelevant features. Next, the SGD-LR model classifies the feature-reduced subset into different class labels. For experimental validation, we collected our own stress prediction dataset with 1197 records of employees from different institutions such as schools, banks, and universities. Among them, 1197 records are filtered with various diseases and work pressure. A detailed set of simulations was carried out in Python, and the results are analyzed in terms of sensitivity, specificity, accuracy, precision, F-score, and kappa.
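The classification stage and the evaluation measures named above can be illustrated with a minimal sketch. It assumes a single label treated as a binary problem, uses a tiny made-up dataset, and implements plain per-sample SGD on the logistic loss in pure Python. The function names, learning rate, and epoch count are illustrative assumptions rather than the authors' settings, and the ET-based feature selection step is omitted.

```python
import math


def sgd_logistic_regression(X, y, lr=0.1, epochs=200):
    """Fit binary logistic regression by per-sample SGD on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted P(y = 1 | x)
            err = p - yi                    # derivative of the log loss w.r.t. z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(X, w, b):
    """Label a sample 1 when its linear score is non-negative (p >= 0.5)."""
    return [int(b + sum(wj * xj for wj, xj in zip(w, xi)) >= 0) for xi in X]


def _ratio(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(y_true, y_pred):
    """Compute the six measures from binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = len(pairs)
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    accuracy = _ratio(tp + tn, n)
    f_score = _ratio(2 * precision * sensitivity, precision + sensitivity)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = _ratio((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn), n * n)
    kappa = _ratio(accuracy - expected, 1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f_score": f_score,
        "kappa": kappa,
    }


# Hypothetical toy data: two features, linearly separable on the first one.
X_toy = [[-2.0, 0.5], [-1.0, -0.5], [1.0, 0.5], [2.0, -0.5]]
y_toy = [0, 0, 1, 1]

w, b = sgd_logistic_regression(X_toy, y_toy)
print(binary_metrics(y_toy, predict(X_toy, w, b)))
```

For a multi-labeled problem, one plausible extension is to compute these measures per class in a one-vs-rest fashion and then average them; whether the article does so is not stated in the abstract.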
The simulation outcomes demonstrated the superior performance of the ETSGD-LR model over the compared methods, with a maximum sensitivity, specificity, and accuracy of 0.980, 0.900, and 0.972, respectively. The experimental results showed that the inclusion of the FS process helps to improve the classification performance.

KEYWORDS: data classification, feature selection, logistic regression, machine learning, stress prediction

Concurrency Computat Pract Exper. 2022;e6911. wileyonlinelibrary.com/journal/cpe © 2022 John Wiley & Sons, Ltd. https://doi.org/10.1002/cpe.6911

1 INTRODUCTION

Humans experience stress and depression for various reasons; these affect a person physically and mentally and can result in serious consequences. Building construction is regarded as one of the most stressful sectors, as it requires physical and mental performance in harsh environments.¹ According to an earlier study,² work stress increases risk factors such as accidents, health issues, lack of concentration, stagnation, and poor performance. These factors directly affect construction work. It is also reported that employees working in the construction sector have the highest stress levels. In addition, competition has grown in this sector, as intelligent technology offers numerous benefits such as the latest models, trending lifestyles, and alternative services. Politicians, managers, and subsidiaries concentrate on achieving a competitive edge within organizations. Hence, workers have made maximum efforts to accomplish the above-mentioned