International Journal of Science and Research (IJSR)
ISSN (Online): 2319-7064, Impact Factor (2012): 3.358
Volume 3 Issue 11, November 2014, www.ijsr.net
Licensed Under Creative Commons Attribution CC BY

A Mining Method to Predict Patient’s DOSH

Ruchi Rathor 1, Pankaj Agarkar 2

1 Department of Computer Engineering, Dr D.Y.Patil School of Engineering, Savitribai Phule Pune University, India
2 Professor, Department of Computer Engineering, Dr D.Y.Patil School of Engineering, Savitribai Phule Pune University, India

Abstract: Management of hospital resources is a major and complex activity that strongly influences the efficiency of a hospital’s services. Resource management is needed most when patients are hospitalized, as a large amount of resources is consumed during that time. Hence, predicting the number of days a patient stays at the hospital can help in organizing hospital resources. In this paper we propose a prediction model that predicts the patient’s Duration of Stay at the Hospital (DOSH). We use basic clustering and classification methods for the prediction. This methodology can bring out hidden anomalies and deviations that cannot be revealed by applying basic clustering, which yields a more efficient prediction.

Keywords: clustering, classification, prediction

1. Introduction

One of the most important factors on which the productivity of a hospital’s services depends is the management of hospital resources. These resources comprise bedding, medicines, food, etc. Management of hospital resources is required so that (a) misuse and waste of resources can be prevented; (b) resources can be utilized efficiently; (c) resources can be estimated better; (d) future resource demands can be planned efficiently; (e) future appointments can be handled; (f) proper medical services can be provided to patients; and (g) a high occupancy rate can be achieved [1] [4].
Hence, managing resources will enhance the productivity of the hospital and the patient’s quality of life [2].

This paper proposes a DOSH prediction model that predicts the number of days a patient will stay at the hospital. Based on that prediction, the amount of resources the patient will require is estimated and reserved for that patient. The prediction is made up to three times: first when the patient arrives to consult the doctor, second at the admission of the patient, and third based on the condition of the patient. If the patient’s condition progresses as predicted, no further predictions are made; if the condition deteriorates or does not improve, a second and possibly a third prediction are made.

2. Related Work

Recognizing the need to predict the duration of a patient’s stay in the hospital, extensive research has been conducted. Since the 1960s, many models have been created to predict a patient’s Length of Stay (LOS) in the hospital, and their results have been compared. D. H. Gustafson [3] proposed and compared five prediction methodologies, of which three produced point estimates based on physicians’ estimates, whereas the other two performed poorly and gave low precision because the data collected were inadequate for estimation.

E. K. Kulinskaya and H. D. Gao [2] investigated the factors that can affect the prediction of a patient’s LOS. The data used were statistical diagnostic data. Because statistical LOS data are not normally distributed and contain many outliers, they proposed a robust statistical methodology to handle the outliers so that a normal distribution can be formed. To analyze the effect of the factors on LOS prediction, two methods were compared on the data: a standard method, General Linear Models (GLM), and a robust method, Truncated Maximum Likelihood (TML).
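As a rough illustration of why outliers in LOS data motivate a robust method, the following Python sketch compares an ordinary mean with a trimmed mean on made-up LOS values. The trimmed mean is only a simple stand-in for the idea of limiting the influence of extreme stays; it is not the TML estimator of [2]. The data, the function name `trimmed_mean`, and the 10% trim fraction are assumptions chosen for illustration.

```python
from statistics import mean, median


def trimmed_mean(values, trim_fraction=0.1):
    """Mean after dropping the lowest and highest trim_fraction of the values."""
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of values cut from each tail
    return mean(ordered[k:len(ordered) - k])


# Hypothetical LOS values in days: mostly short stays plus two extreme outliers.
los_days = [2, 3, 3, 4, 4, 4, 5, 5, 6, 7, 3, 4, 5, 2, 6, 4, 3, 5, 60, 90]

plain = mean(los_days)
robust = trimmed_mean(los_days, trim_fraction=0.1)

print(f"plain mean:   {plain:.2f} days")
print(f"trimmed mean: {robust:.2f} days")
print(f"median:       {median(los_days):.2f} days")
```

In this made-up sample the two extreme stays pull the plain mean well above the typical stay, while the trimmed mean remains close to the bulk of the values.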
Of the two, TML proved to be the better estimator. However, the accuracy of the predictions was not compared with the actual values.

V. Liu, P. Kipnis, M. K. Gould, and G. J. Escobar [5] predicted LOS using a linear regression model and a data set from 17 hospitals with a total of 205,177 hospitalizations. In addition, adding the Laboratory Acute Physiology Score (LAPS) and the Comorbidity Point Score (COPS) to the linear regression model improved the outcome.

To reduce the uncertainty of LOS at hospitals, Ali Azari, Vandana P. Janeja, and Alex Mohseni [4] proposed a multi-tiered data mining approach in which four training sets were created, three by applying different clustering approaches and one without clustering, and then analyzed by ten different classification algorithms. Each training set was processed by each classification algorithm, forming about forty models. These models were compared on the performance measures of the classification algorithms: accuracy, recall, area under the curve, kappa statistic, and precision. To rank the models, a Friedman test was conducted, which concluded that Support Vector Machine and Bnet generated the better predictions. This approach could not give appropriate predictions for outliers and performed weakly for clusters of varying shapes and densities. Also, because of anomalous tuples, the prediction was not truly efficient.

Panchami V U [1] proposed a model that predicted LOS longer than seven days. The dataset used was statistical data from hospitals. By using

Paper ID: OCT141207