Oscar Luaces ⋆, José R. Quevedo ⋆, Francisco Taboada †, Guillermo M. Albaiceta †, Antonio Bahamonde ⋆
⋆ Artificial Intelligence Center, University of Oviedo at Gijón, Asturias - Spain
† Hospital Univ. Central de Asturias (HUCA), University of Oviedo, Asturias - Spain

Abstract

The paper presents a support vector method for estimating probabilities in a real-world problem: predicting the probability of survival of critically ill patients. The standard procedure with Support Vector Machines uses Platt's method to fit a sigmoid that transforms continuous outputs into probabilities. The method proposed here exploits the difference between maximizing the AUC and minimizing the error rate in binary classification tasks. The conclusion is that it is preferable to optimize the AUC first (using a multivariate SVM) and then fit a sigmoid. We provide experimental evidence in favor of our proposal. For this purpose, we used data collected in general ICUs at 10 hospitals in Spain; 6 of these include coronary patients, while the other 4 do not treat coronary diseases. The total number of patients considered in our study was 2501.

1 Introduction

The available models for predicting outcomes in intensive care units (ICU) are usually scoring systems that estimate the probability of hospital mortality of critically ill adults. This is the case of APACHE (Acute Physiology And Chronic Health Evaluation) [Knaus et al., 1991], SAPS (Simplified Acute Physiology Score) [Le Gall et al., 1984], and MPM (Mortality Probability Models) [Lemeshow et al., 1993]. The score functions of these predictors were induced from data on thousands of patients using logistic regression. The data required by these systems come from monitoring devices, clinical analysis, and demographic and diagnostic features of patients.
For example, APACHE III includes age, 16 acute physiologic variables that use the worst value from the first 24 hours in the ICU (temperature, heart rate, blood pressure, respiratory rate, oxygenation, acid-base status, serum sodium, serum blood urea nitrogen, serum creatinine, serum albumin, serum bilirubin, serum glucose, white cell count, hematocrit, itemized Glasgow Coma Scale score, and urine output), preexisting functional limitations, major comorbidities, and treatment location immediately prior to ICU admission.

∗ The research reported here is supported in part under grant TIN2005-08288 from the MEC (Ministerio de Educación y Ciencia of Spain). The authors acknowledge the work of the Grecia Group in the collection of data.

These prognostic models are mainly used to measure the efficiency of ICU treatments. The risk stratification of patients allows the observed outcomes to be compared against the accepted standards provided by score functions. ICU assessment is very important, since it is estimated that end-of-life care consumes 10% to 12% of all healthcare costs. Moreover, in 2001 the average daily cost per patient in ICUs was about $3000 in the USA [Provonost and Angus, 2001]. In addition, the literature shows that prognoses have constituted an important dimension of critical care, as patients and their families seek predictions about the duration and outcome of illness [Lemeshow et al., 1993].

In this paper we propose a new method for learning probabilities, which we test on the task of estimating the probability of survival of ICU patients. The method makes intensive use of Support Vector Machines (SVM), a powerful family of algorithms for learning classification and regression tasks. When used for binary classification, SVM learn hypotheses that return continuous numbers: positive values for cases of one class, and negative values for the other class.
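To make the step from continuous outputs to probabilities concrete, the following is a minimal sketch of fitting a sigmoid of the form P(+1 | f) = 1 / (1 + exp(A f + B)) to a set of scores f, in the spirit of Platt's method. It is an illustration under stated assumptions, not the implementation used in this paper: the function names, the toy scores, the learning rate and the plain gradient descent are all hypothetical, and the sketch uses unsmoothed 0/1 targets rather than Platt's regularized targets.

```python
import math


def sigmoid(score, a, b):
    """Map a continuous output to P(class = +1) as 1 / (1 + exp(a*score + b))."""
    z = a * score + b
    if z >= 0:
        # Equivalent form that avoids overflow for large positive z.
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def fit_sigmoid(scores, labels, lr=0.1, epochs=2000):
    """Fit (a, b) by gradient descent on the negative log-likelihood.

    labels are +1 / -1. Simplification (assumption): targets are plain
    0/1 values, without the smoothing used in Platt's original method.
    """
    targets = [1.0 if y == 1 else 0.0 for y in labels]
    n = len(scores)
    a, b = 0.0, 0.0
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, t in zip(scores, targets):
            p = sigmoid(s, a, b)
            # With z = a*s + b, the derivative of the loss w.r.t. z is (t - p).
            grad_a += (t - p) * s
            grad_b += (t - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Hypothetical SVM-like outputs: mostly positive for class +1, negative for
# class -1, with a small overlap so the fit stays finite.
toy_scores = [-2.0, -1.5, -1.0, -0.3, 0.2, -0.1, 0.4, 1.0, 1.6, 2.2]
toy_labels = [-1, -1, -1, -1, -1, 1, 1, 1, 1, 1]

A, B = fit_sigmoid(toy_scores, toy_labels)
probs = [sigmoid(s, A, B) for s in toy_scores]
```

With this parameterization the fitted A is negative, so larger scores map to larger probabilities of class '+1'. The sigmoid is monotone, so it changes the scale of the outputs but not the order in which they rank the cases.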
To measure the performance of predictions in medicine, and in general when classes are very unbalanced, the misclassification rate (or accuracy) is usually inadequate. Frequently, the Area Under a receiver operating characteristic (ROC) Curve (AUC for short) is used instead. This quantity can be interpreted as the degree of coherence between a continuous output (such as a probability, or the continuous output of an SVM) and a binary classification. It is important to emphasize that this coherence is established in terms of orderings. For this purpose, continuous outputs or scores are used to rank the available cases, while classes in the ICU problem are coded as ‘+1’ when the patient has survived, and ‘−1’ otherwise. In this context, Hanley and McNeil [1982] showed that the AUC is the probability of a correct ranking; in other words, it is the probability that a randomly chosen subject of class ‘+1’ is (correctly) ranked with a greater output than a randomly chosen subject of class ‘−1’. Therefore, the AUC coincides with the value of the Wilcoxon-Mann-Whitney statistic.

Additionally, there are other measures of the goodness of probability estimations; for instance, the Brier score is the average of quadratic deviations of true and predicted probabili-
IJCAI-07 956
Prediction of Probability of Survival in Critically Ill Patients Optimizing the Area Under the ROC Curve ∗