Tech Science Press
Computers, Materials & Continua
DOI:10.32604/cmc.2022.021666
Article

Robust Length of Stay Prediction Model for Indoor Patients

Ayesha Siddiqa1, Syed Abbas Zilqurnain Naqvi1, Muhammad Ahsan1, Allah Ditta2, Hani Alquhayz3, M. A. Khan4 and Muhammad Adnan Khan5,*

1 Department of Mechatronics and Control Engineering, University of Engineering and Technology, Lahore, 54000, Pakistan
2 Department of Information Sciences, Division of Science and Technology, University of Education, Lahore, 54000, Pakistan
3 Department of Computer Science and Information, College of Science in Zulfi, Majmaah University, Al-Majmaah, 11952, Saudi Arabia
4 Riphah School of Computing & Innovation, Faculty of Computing, Riphah International University, Lahore Campus, Lahore, 54000, Pakistan
5 Pattern Recognition and Machine Learning Lab, Department of Software, Gachon University, Seongnam, 13557, Korea
* Corresponding Author: Muhammad Adnan Khan. Email: adnan@gachon.ac.kr

Received: 08 July 2021; Accepted: 09 August 2021

Abstract: Due to unforeseen climate change, complicated chronic diseases, and the mutation of viruses, a top challenge for hospital administration is knowing the length of stay (LOS) of patients with different diseases. Hospital management does not know exactly when a current patient will leave the hospital, although this information is crucial because it would allow the hospital to admit more patients. As a result, hospitals face many problems in managing available resources and in admitting new patients for prompt treatment. Therefore, a robust model needs to be designed to help hospital administration predict patients' LOS and resolve these issues.
For this purpose, a very large dataset (more than 2.3 million patient records) from New York hospitals, containing information about a wide range of diseases including bone marrow, tuberculosis, intestinal transplant, mental illness, leukaemia, spinal cord injury, trauma, rehabilitation, kidney and alcoholic patients, HIV patients, malignant breast disorder, asthma, respiratory distress syndrome, etc., has been analyzed to predict the LOS. We selected six machine learning (ML) models: multiple linear regression (MLR), lasso regression (LR), ridge regression (RR), decision tree regression (DTR), extreme gradient boosting regression (XGBR), and random forest regression (RFR). The predictive performance of the selected models was evaluated using R square (RS) and mean square error (MSE). Our results revealed the superior predictive performance of the RFR model among all selected models, in terms of both RS score (92%) and MSE score (5). Exploratory data analysis (EDA) showed that most stays lasted between 0 and 5 days, with a mean stay of 5.3 days per patient, and that patients older than 50 years spent more days in the hospital. Based on the average LOS, patients with diagnoses related to birth complications spent more days in the hospital than patients with other diagnoses.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This finding could help predict the