Computational Statistics and Data Analysis 56 (2012) 1256–1274
Bayesian model selection for logistic regression models with random intercept✩
Helga Wagner∗, Christine Duller
Department of Applied Statistics and Econometrics, Johannes Kepler Universität, Altenbergerstrasse 69, 4040 Linz, Austria
article info
Article history:
Available online 6 July 2011
Keywords: Variable selection; Variance selection; MCMC; Auxiliary mixture sampling; Normal scale mixtures; Spike and slab priors
abstract
Data collected to model the risk of an event of interest often have a multilevel structure,
as patients are clustered within larger units, e.g. clinical centers. The risk of the event is
usually modeled using a logistic regression model with a random intercept to control
for heterogeneity among clusters. Model specification requires deciding which regressors
have a non-negligible effect and hence should be included in the final model, and whether
risk is actually heterogeneous among centers, i.e. whether the model should include a
random intercept or not. In a Bayesian approach, these questions can be answered by
combining variable selection with variance selection of the random intercept. Bayesian
model selection is performed for a reparameterized version of the logistic random intercept
model using spike and slab priors on the parameters subject to selection. Different
specifications for these priors are compared on simulated data as well as on a data set
where the goal is to identify risk factors for complications after endoscopic retrograde
cholangiopancreatography (ERCP).
© 2011 Elsevier B.V. All rights reserved.
1. Introduction
Medical studies are often carried out with the goal of identifying factors that affect the risk of a disease or an adverse treatment
effect. In this paper, we analyze data from an Austrian benchmarking project in which information on routinely applied
endoscopic retrograde cholangiopancreatographies was collected in 29 Austrian centers in 2006 and 2007. ERCP is an X-ray
examination of the pancreatic and bile ducts using a contrast medium and a special kind of endoscope. Moreover, ERCP
is often used for medical treatments, e.g. for removal of gallstones. The procedure entails a risk of several complications, e.g.
post-ERCP pancreatitis, cholangitis, perforation, bleeding, and, very rarely, even procedure-related death (Kapral et al., 2008).
The aim of our analysis was to identify patient- and procedure-related risk factors for bleeding after routinely
applied ERCPs. A further issue was to assess whether risk is homogeneous among clinical centers after controlling for
covariates. We model the risk of bleeding after ERCP using a logistic model with a large set of potential risk factors as
covariates and include a random intercept to account for clustering of observations within centers. The data set comprises
data on 3143 patients and contains one continuous covariate (age) and 36 binary covariates, which indicate the presence or
absence of a factor considered to affect risk. Bleeding is rare: it occurred in only 118 (3.8%) of the patients.
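The model described in this paragraph can be written out as a short sketch. The notation is an assumption made here for illustration and need not match the notation used later in the paper: y_{ij} indicates bleeding for patient j in center i, x_{ij} collects the 37 covariates (age and the 36 binary indicators), and b_i is the center-specific random intercept, assumed here to be normally distributed.

```latex
% Sketch of a logistic random intercept model (illustrative notation, requires amsmath)
\begin{align*}
y_{ij} \mid \pi_{ij} &\sim \mathrm{Bernoulli}(\pi_{ij}),
  \qquad i = 1,\dots,29, \quad j = 1,\dots,n_i, \quad \sum_{i=1}^{29} n_i = 3143, \\
\log\frac{\pi_{ij}}{1-\pi_{ij}} &= \mu + \mathbf{x}_{ij}^{\top}\boldsymbol{\alpha} + b_i, \\
b_i &\sim \mathcal{N}\!\left(0, \sigma_b^{2}\right) \quad \text{(assumed distribution)}.
\end{align*}
```

In this notation, asking whether covariate k is a relevant risk factor amounts to asking whether \alpha_k differs from zero, and asking whether risk is homogeneous among centers amounts to asking whether \sigma_b^2 = 0, in which case all b_i vanish and the model reduces to an ordinary logistic regression.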
Inference for this data set is challenging because some of the potential risk factors are themselves rare, so that the joint
occurrence of bleeding and these risk factors is rarer still. Even though the sample size is not particularly small, no case of bleeding
was observed in the following groups of patients: patients with previous gastric surgery, patients whose indication for
ERCP was pancreatic duct stone, and patients for whom the oxygenation of hemoglobin was controlled by pulse oximetry.
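To see why groups without any observed bleeding are unsurprising at this event rate, the following minimal Python sketch computes the probability that a group of patients contains no case at all. It assumes that patients are independent and that the factor defining the group has no effect on risk, so every patient bleeds with the overall rate 118/3143. The group sizes are hypothetical, since the sizes of the groups above are not stated here, and the function name `prob_no_events` is introduced only for this illustration.

```python
def prob_no_events(group_size: int, event_rate: float) -> float:
    """Probability that none of `group_size` independent patients has the event."""
    if group_size < 0:
        raise ValueError("group_size must be non-negative")
    if not 0.0 <= event_rate <= 1.0:
        raise ValueError("event_rate must lie in [0, 1]")
    return (1.0 - event_rate) ** group_size


# Overall bleeding rate reported in the text: 118 of 3143 patients.
overall_rate = 118 / 3143

# Hypothetical group sizes, chosen for illustration only.
for n in (10, 50, 100):
    print(f"group size {n:3d}: P(no bleeding) = {prob_no_events(n, overall_rate):.3f}")
```

Under these assumptions, small groups have a substantial chance of showing no event. When a binary covariate has no events in one of its categories, the maximum likelihood estimate of its coefficient in a logistic regression does not exist as a finite value, which is one reason why such data are difficult to analyze with standard methods.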
✩ The code used in this paper and a file describing its usage can be found as supplementary material in the electronic version of the paper.
∗ Corresponding author. Tel.: +43 2468 5883; fax: +43 2468 9846.
E-mail addresses: helga.wagner@jku.at (H. Wagner), christine.duller@jku.at (C. Duller).
doi:10.1016/j.csda.2011.06.033