Vol.9 (2019) No. 5 ISSN: 2088-5334

A Comparison of Supervised Learning Techniques for Predicting the Mortality of Patients with Altered State of Consciousness

Muhammad Ariff Yasri #1, Shamimi A. Halim #2, Muthukkaruppan Annamalai #3

# Faculty Computer and Mathematical Science, Universiti Teknologi MARA, Shah Alam, Selangor, 40450, Malaysia

E-mail: 1 ariffyasri@yahoo.com; 2 shamimi@tmsk.uitm.edu.my; 3 mk@tmsk.uitm.edu.my

Abstract: The study attempts to identify a potentially reliable supervised learning technique for predicting mortality outcomes in patients with an altered state of consciousness (ASC). ASC, a state distinct from ordinary waking consciousness, is a common phenomenon in the Emergency Department (ED). Thirty (30) distinctive attributes or features are commonly used to recognize ASC. The study accordingly applied these features to model the prediction of mortality in ASC patients. Supervised learning techniques are found to be suitable for such classification problems. Consequently, the study compared five supervised learning techniques that are commonly applied to evaluate the risk of mortality using health-related datasets, namely Decision Tree, Neural Network, Random Forest, Naïve Bayes, and Logistic Regression. The labeled dataset comprised patient records captured by the Universiti Sains Malaysia hospital’s Emergency Medicine department from June to November 2008. The cleaned dataset was divided into two parts: the larger part was used for training and the smaller part for evaluation. Since the ratio between training and testing samples varies between individual supervised learning techniques, we also varied the proportion of the dataset used for training when studying the performance of the modeled techniques. We applied four percentage splits (66%, 75%, 80%, and 90%) to allow for 3-, 4-, 5-, and 10-fold cross-validation experiments to evaluate the accuracy of the analyzed techniques.
The variation helped to lessen the chance of overfitting and averaged out the effects of various conditions on accuracy. The experiments were conducted in the WEKA environment. The results indicated that Random Forest is the most reliable technique for modeling the prediction of mortality in ASC patients, with acceptable accuracy, sensitivity, and specificity of 70.9%, 76.3%, and 65.5%, respectively. The results are further confirmed by SROC analysis. The findings of the study serve as a fundamental step towards a comprehensive study in the future.

Keywords: supervised learning technique; predictive modelling; mortality; altered state of consciousness.

I. INTRODUCTION

An altered state of consciousness (ASC) is a common emergency case in the emergency department, and it is associated with significant mortality. The exact etiology of ASC is unknown at the clinical point of care. Consequently, a reliable prognosis is difficult to establish. Yet surgical, medical, and ethical decisions depend upon this information. While it is legitimate to provide optimal medical and therapeutic care to patients with a good prognosis, it may not be desirable for medical teams to pursue such treatments when the predicted prognosis is poor. A better understanding of patients’ outcomes would help in decisions related to rehabilitation, acute care, or end-of-life care to reduce the risk of in-hospital death. Quick and accurate prediction of mortality for patients with ASC is essential to ensure immediate and appropriate actions or interventions in emergency departments. Prediction systems that can learn from collected data have the potential to offer rapid and reliable prognostic information for medical teams’ decision-making. Machine learning allows computers to learn and analyze patterns without being explicitly programmed [1]. The two main types of machine learning techniques are supervised and unsupervised.
A supervised learning technique can be regarded as learning a function that maps an input to an output based on a labeled training dataset. The training set (input-output pairs) can be extracted from existing electronic medical records. Unsupervised learning, on the other hand, looks at unlabeled data and tries to learn the patterns in the data without any labeled examples. When a labeled dataset is available, supervised learning techniques are applied because they make it possible to test the predictive model. Moreover, supervised learning techniques are suitable for classification problems such as ours, and they are said to generate reasonably accurate predictions for new data [2]. The challenge is to identify a reliable supervised learning technique for a given problem, because no single method is a good fit for every application problem. The application problem of this study is to predict the outcomes of mortality in ASC patients, which is a classification problem.
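To make the evaluation idea concrete, the following minimal Python sketch illustrates how a labeled dataset allows a learned model to be tested, and how k-fold cross-validation relates to the percentage splits mentioned in the abstract (each round of k-fold trains on a fraction 1 - 1/k of the records, so k = 3, 4, 5, and 10 correspond to roughly 66%, 75%, 80%, and 90%). Everything in it is an assumption for illustration: the records are synthetic toy values rather than the study's data, the single numeric feature and the nearest-class-mean learner are hypothetical stand-ins for the thirty ASC features and the five techniques compared, and the study itself ran its experiments in WEKA rather than with this code.

```python
# Illustrative sketch only: toy data and a toy learner, not the study's WEKA setup.

def train_fraction(k):
    """Share of records used for training in each round of k-fold cross-validation."""
    return 1 - 1 / k


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds over n records."""
    start = 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        test = list(range(start, start + size))
        train = [j for j in range(n) if j < start or j >= start + size]
        yield train, test
        start += size


def fit_nearest_mean(xs, ys):
    """Learn one feature mean per class from labeled (input, output) pairs."""
    means = {}
    for label in sorted(set(ys)):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return means


def predict(means, x):
    """Predict the class whose learned mean is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))


def confusion_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (true-positive rate), and specificity (true-negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def cross_validate(records, k):
    """Pool held-out predictions over k folds and score them against the true labels."""
    y_true, y_pred = [], []
    for train, test in k_fold_indices(len(records), k):
        means = fit_nearest_mean(
            [records[i][0] for i in train], [records[i][1] for i in train]
        )
        for i in test:
            y_true.append(records[i][1])
            y_pred.append(predict(means, records[i][0]))
    return confusion_metrics(y_true, y_pred)


# Synthetic (feature, label) pairs; label 1 = died, 0 = survived. Hypothetical values.
TOY_RECORDS = [
    (2, 0), (9, 1), (3, 0), (8, 1), (4, 0), (7, 1),
    (6, 0), (5, 1), (1, 0), (10, 1), (3, 0), (8, 1),
]

if __name__ == "__main__":
    for k in (3, 4, 5, 10):
        print(f"{k}-fold trains on {train_fraction(k):.1%} of the records per round")
    print(cross_validate(TOY_RECORDS, 3))
```

Pooling the held-out predictions before scoring is one of several reasonable choices; averaging the per-fold metrics is another, and the two can differ slightly when folds are unequal in size or class balance.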