Artificial Intelligence in Medicine 72 (2016) 72–82
Journal homepage: www.elsevier.com/locate/aiim

A mixed-ensemble model for hospital readmission

Lior Turgeman, Jerrold H. May
Katz Graduate School of Business, University of Pittsburgh, Pittsburgh, PA 15260, United States

Article info
Article history: Received 2 May 2016; received in revised form 20 July 2016; accepted 30 August 2016
Keywords: Decision trees; Support vector machine (SVM); Ensemble learning; Imbalanced data set; Decision function; Error reduction; Hospital readmission

Abstract
Objective: A hospital readmission is defined as an admission to a hospital within a certain time frame, typically thirty days, following a previous discharge, either to the same or to a different hospital. Because most patients are not readmitted, the readmission classification problem is highly imbalanced.
Materials and methods: We developed a hospital readmission predictive model that enables control of the tradeoff between reasoning transparency and predictive accuracy by taking into account the unique characteristics of the learned database. A boosted C5.0 tree, as the base classifier, was ensembled with a support vector machine (SVM), as a secondary classifier. The models were induced and validated using anonymized administrative records of 20,321 inpatient admissions of 4840 Congestive Heart Failure (CHF) patients at the Veterans Health Administration (VHA) hospitals in Pittsburgh, from fiscal years (FY) 2006 through 2014.
Results: The SVM predictions are characterized by greater sensitivity values (true positive rates) than are the C5.0 predictions, for a wider range of cutoff values of the ROC curve, depending on a predefined confidence threshold for the base C5.0 classifier. The total accuracy for the ensemble ranges from 81% to 85%.
Different predictors, including comorbidities, lab values, and vitals, play different roles in the two models.
Conclusions: The mixed-ensemble model enables easy and fast exploratory knowledge discovery of the database, and control of the classification error for positive readmission instances. Implementing this ensembling method to predict all-cause hospital readmissions of CHF patients overcomes some of the limitations of the classifiers considered individually, and of other traditional ensembling methods. It also increases the classification accuracy for positive readmission instances, particularly when strong predictors are not available.
© 2016 Elsevier B.V. All rights reserved.

Corresponding author. E-mail address: Tur.lior@gmail.com (L. Turgeman).

1. Introduction

A hospital readmission is defined as an admission to a hospital within a certain time frame, following an original hospital discharge, either to the same or to a different hospital. The Congestive Heart Failure (CHF) diagnosis has some of the highest percentages of patients who are readmitted to a hospital within thirty days of discharge [1–4], and is the leading cause of hospital admissions among patients over the age of 65 years [5]. CHF is also associated with high rates of mortality and morbidity [6].

Several previous papers used logistic regression to estimate the probability of hospital readmissions [7–10]. Another type of baseline model uses survival analysis (or hazard models) to estimate the time duration between consecutive patient admissions [11,12]. Although both approaches are useful in identifying readmission risk factors, they are not as useful for dealing with the non-stationary nature of patient readmissions, where the readmission propensity might change over time, depending on different conditions and treatments during prior admissions [13].
Also, most approaches are characterized by limited classification power when a large range of variables is considered.

http://dx.doi.org/10.1016/j.artmed.2016.08.005 0933-3657/© 2016 Elsevier B.V. All rights reserved.

A variety of reasons could lead to readmissions, such as early discharge of patients, improper discharge planning, and poor care transition [14–16]. Vinson et al. [4] found that the factors predictive of readmission of CHF patients included a prior history of heart failure, four or more admissions within the preceding eight years, and heart failure precipitated by an acute myocardial infarction or uncontrolled hypertension. Using subjective criteria, they indicated that factors contributing to preventable readmissions included non-compliance with medications or diet, inadequate discharge planning or follow-up, a failed social support system, and failure to seek medical attention promptly when symptoms recurred. Schwartz et al. [17] studied the severity of cardiac illness, cognitive functioning, and functional health of 156 patients within