Artificial Intelligence in Medicine 72 (2016) 72–82
Journal homepage: www.elsevier.com/locate/aiim
A mixed-ensemble model for hospital readmission
Lior Turgeman∗, Jerrold H. May
Katz Graduate School of Business, University of Pittsburgh, Pittsburgh, PA 15260, United States
Article info
Article history:
Received 2 May 2016
Received in revised form 20 July 2016
Accepted 30 August 2016
Keywords:
Decision trees
Support vector machine (SVM)
Ensemble learning
Imbalanced data set
Decision function
Error reduction
Hospital readmission
Abstract
Objective: A hospital readmission is defined as an admission to a hospital within a certain time frame,
typically thirty days, following a previous discharge, either to the same or to a different hospital. Because
most patients are not readmitted, the readmission classification problem is highly imbalanced.
Materials and methods: We developed a hospital readmission predictive model, which enables controlling
the tradeoff between reasoning transparency and predictive accuracy, by taking into account the unique
characteristics of the learned database. A boosted C5.0 tree, as the base classifier, was ensembled with a
support vector machine (SVM), as a secondary classifier. The models were induced and validated using
anonymized administrative records of 20,321 inpatient admissions, of 4840 Congestive Heart Failure
(CHF) patients, at the Veterans Health Administration (VHA) hospitals in Pittsburgh, from fiscal years
(FY) 2006 through 2014.
Results: The SVM predictions are characterized by greater sensitivity values (true positive rates) than are
the C5.0 predictions, for a wider range of cutoff values of the ROC curve, depending on a predefined
confidence threshold for the base C5.0 classifier. The total accuracy for the ensemble ranges from 81% to
85%. Different predictors, including comorbidities, lab values, and vitals, play different roles in the two
models.
Conclusions: The mixed-ensemble model enables easy and fast exploratory knowledge discovery of the
database, and control of the classification error for positive readmission instances. Implementing
this ensembling method to predict all-cause hospital readmissions of CHF patients overcomes
some of the limitations of the classifiers considered individually, and of other traditional ensembling
methods. It also increases the classification accuracy for positive readmission instances, particularly when
strong predictors are not available.
© 2016 Elsevier B.V. All rights reserved.
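As a purely illustrative sketch of the confidence-threshold idea mentioned in the abstract, the following Python function routes each instance to one of two classifiers. The routing rule is an assumption made here for illustration and is not taken from the paper: the base classifier's prediction is kept when its confidence, taken as max(p, 1 - p), reaches the threshold, and the secondary classifier's label is used otherwise. The function name, parameter names, and example scores are hypothetical.

```python
from typing import List, Sequence


def mixed_ensemble_predict(
    base_pos_prob: Sequence[float],
    secondary_labels: Sequence[int],
    confidence_threshold: float,
) -> List[int]:
    """Combine a base and a secondary classifier by confidence routing.

    ASSUMPTION (not confirmed by the source): the base model's confidence
    is max(p, 1 - p), where p is its estimated probability of readmission,
    and instances below the threshold are deferred to the secondary model.
    """
    if len(base_pos_prob) != len(secondary_labels):
        raise ValueError("inputs must have the same length")
    if not 0.5 <= confidence_threshold <= 1.0:
        raise ValueError("confidence_threshold must lie in [0.5, 1.0]")

    combined = []
    for p, fallback in zip(base_pos_prob, secondary_labels):
        confidence = max(p, 1.0 - p)
        if confidence >= confidence_threshold:
            # Base classifier is confident enough: keep its own label.
            combined.append(1 if p >= 0.5 else 0)
        else:
            # Otherwise defer to the secondary classifier's label.
            combined.append(int(fallback))
    return combined


# Hypothetical scores for four admissions (1 = readmitted, 0 = not).
base_pos_prob = [0.95, 0.55, 0.10, 0.40]
secondary_labels = [0, 1, 1, 1]
predictions = mixed_ensemble_predict(base_pos_prob, secondary_labels, 0.8)
```

Under this assumed rule, raising the threshold sends more instances to the secondary classifier, which is one way a single parameter could trade the transparency of a tree model against the behavior of an SVM.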
1. Introduction
A hospital readmission is defined as an admission to a hospital
within a certain time frame, following an original hospital
discharge, either to the same or to a different hospital.
Congestive Heart Failure (CHF) is among the diagnoses with the highest
percentages of patients who are readmitted to a hospital within
thirty days of discharge [1–4], and is the leading cause of hospital
admissions among patients over the age of 65 years [5]. CHF
is also associated with high rates of mortality and morbidity [6].
Several previous papers used logistic regression to estimate the
probability of hospital readmissions [7–10]. Another type of baseline
model uses survival analysis (or hazard models) to estimate
the time duration between consecutive patient admissions [11,12].
Although both approaches are useful in identifying readmission risk
factors, they are not as useful for dealing with the non-stationary
nature of patient readmissions, where the readmission propensity
might change over time, depending on different conditions and
treatments during prior admissions [13]. Also, most approaches are
characterized by limited classification power when a large range of
variables is considered.
∗ Corresponding author.
E-mail address: Tur.lior@gmail.com (L. Turgeman).
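To make the readmission definition at the start of this section concrete, the sketch below labels each admission according to whether the same patient is admitted again within a fixed window after discharge. It is an illustration only: the record layout, the function name, the inclusive 30-day window, and the example dates are assumptions, not details from the paper. Because the definition covers readmission to the same or a different hospital, no hospital field is used.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical record layout: (patient_id, admit_date, discharge_date).
Admission = Tuple[str, date, date]


def label_readmissions(
    admissions: List[Admission], window_days: int = 30
) -> List[int]:
    """Return 1 for each admission followed by another admission of the
    same patient within window_days of discharge, else 0.

    Labels are returned in the same order as the input records.
    ASSUMPTION: the window is inclusive and is measured from the
    discharge date to the next admission date.
    """
    by_patient: Dict[str, List[int]] = defaultdict(list)
    for idx, (patient_id, _, _) in enumerate(admissions):
        by_patient[patient_id].append(idx)

    labels = [0] * len(admissions)
    for indices in by_patient.values():
        # Order each patient's admissions chronologically by admit date.
        indices.sort(key=lambda i: admissions[i][1])
        for current, following in zip(indices, indices[1:]):
            gap = (admissions[following][1] - admissions[current][2]).days
            if 0 <= gap <= window_days:
                labels[current] = 1
    return labels


# Made-up example records for two patients.
records = [
    ("A", date(2010, 1, 1), date(2010, 1, 5)),
    ("A", date(2010, 1, 20), date(2010, 1, 25)),
    ("A", date(2010, 6, 1), date(2010, 6, 3)),
    ("B", date(2011, 3, 1), date(2011, 3, 4)),
]
labels = label_readmissions(records)
positive_rate = sum(labels) / len(labels)
```

In this made-up example only one of four admissions is labeled positive, which mirrors on a small scale the class imbalance that arises because most patients are not readmitted.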
A variety of reasons could lead to readmissions, such as early
discharge of patients, improper discharge planning, and poor care
transition [14–16]. Vinson et al. [4] found that the factors
predictive of readmission of CHF patients included a prior history of
heart failure, four or more admissions within the preceding eight
years, and heart failure precipitated by an acute myocardial
infarction or uncontrolled hypertension. Using subjective criteria, they
indicated that factors contributing to preventable readmissions
included non-compliance with medications or diet, inadequate
discharge planning or follow-up, a failed social support system,
and failure to seek medical attention promptly when symptoms
recurred. Schwartz et al. [17] studied the severity of cardiac illness,
cognitive functioning, and functional health of 156 patients within
http://dx.doi.org/10.1016/j.artmed.2016.08.005
0933-3657/© 2016 Elsevier B.V. All rights reserved.