Computational Statistics and Data Analysis 56 (2012) 1256–1274
journal homepage: www.elsevier.com/locate/csda

Bayesian model selection for logistic regression models with random intercept

Helga Wagner, Christine Duller
Department of Applied Statistics and Econometrics, Johannes Kepler Universität, Altenbergerstrasse 69, 4040 Linz, Austria

Article info
Article history: Available online 6 July 2011
Keywords: Variable selection; Variance selection; MCMC; Auxiliary mixture sampling; Normal scale mixtures; Spike and slab priors

Abstract
Data collected to model the risk of an event of interest often have a multilevel structure, as patients are clustered within larger units, e.g. clinical centers. Risk of the event is usually modeled using a logistic regression model with a random intercept to control for heterogeneity among clusters. Model specification requires deciding which regressors have a non-negligible effect and hence should be included in the final model, and whether risk is actually heterogeneous among centers, i.e. whether the model should include a random intercept or not. In a Bayesian approach, these questions can be answered by combining variable selection with variance selection of the random intercept. Bayesian model selection is performed for a reparameterized version of the logistic random intercept model using spike and slab priors on the parameters subject to selection. Different specifications for these priors are compared on simulated data as well as on a data set where the goal is to identify risk factors for complications after endoscopic retrograde cholangiopancreatography (ERCP).
© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Medical studies are often carried out with the goal of identifying factors that affect the risk of a disease or of an adverse treatment effect.
In this paper, we analyze data from an Austrian benchmarking project in which information on routinely applied endoscopic retrograde cholangiopancreatographies (ERCPs) was collected in 29 Austrian centers in 2006 and 2007. ERCP is an X-ray examination of the pancreatic and bile ducts using a contrast medium and a special kind of endoscope. Moreover, ERCP is often used for medical treatments, e.g. for removal of gallstones. The procedure entails a risk of several complications, e.g. post-ERCP pancreatitis, cholangitis, perforation, bleeding, and very rarely even procedure-related death (Kapral et al., 2008). The aim of our analysis was to identify patient- and procedure-related risk factors for bleeding after routinely applied ERCPs. A further issue was to assess whether risk is homogeneous among clinical centers after controlling for covariates.

We model the risk of bleeding after ERCP using a logistic model with a large set of potential risk factors as covariates and include a random intercept to account for clustering of observations within centers. The data set comprises data on 3143 patients and contains one continuous covariate (age) and 36 binary covariates, each of which indicates the presence or absence of a factor considered to affect risk. Bleeding is rare: it occurred in only 118 (3.8%) patients. Inference for this data set is challenging because some of the potential risk factors are themselves rare, so that the joint occurrence of bleeding and these risk factors is rarer still. Even though the sample size is not particularly small, no case of bleeding was observed in the following groups of patients: patients with previous gastric surgery, patients whose indication for ERCP was pancreatic duct stone, and patients for whom the oxygenation of hemoglobin was controlled by pulse oximetry.

The code used in this paper and a file describing its usage can be found as supplementary material in the electronic version of the paper.
Corresponding author. Tel.: +43 2468 5883; fax: +43 2468 9846.
E-mail addresses: helga.wagner@jku.at (H. Wagner), christine.duller@jku.at (C. Duller).
0167-9473/$ – see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.csda.2011.06.033
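
To make the model structure described in the Introduction concrete, the following is a minimal sketch that simulates binary outcomes from a logistic regression model with a center-specific random intercept. It is not the authors' supplementary code. The function name, the coefficient values, the covariate prevalence of 0.5, and the 100 patients per center are all hypothetical choices for illustration; only the number of centers (29) is taken from the text.

```python
import math
import random


def simulate_random_intercept_logit(n_centers, n_per_center, intercept, beta,
                                    sd_intercept, seed=0):
    """Simulate binary outcomes from a logistic random intercept model.

    Linear predictor for patient i in center j (illustrative notation):
        eta_ij = intercept + sum_k beta[k] * x_ijk + b_j,
        b_j ~ Normal(0, sd_intercept^2),
    and P(y_ij = 1) = 1 / (1 + exp(-eta_ij)).
    """
    rng = random.Random(seed)
    rows = []
    for center in range(n_centers):
        # Center-specific deviation; sd_intercept = 0 gives homogeneous risk.
        b = rng.gauss(0.0, sd_intercept)
        for _ in range(n_per_center):
            # Binary covariates (factor present/absent); prevalence 0.5 is assumed.
            x = [1 if rng.random() < 0.5 else 0 for _ in beta]
            eta = intercept + sum(bk * xk for bk, xk in zip(beta, x)) + b
            p = 1.0 / (1.0 + math.exp(-eta))
            y = 1 if rng.random() < p else 0
            rows.append((center, x, y))
    return rows


# Hypothetical settings: one covariate with an effect, two without,
# and a negative intercept so that the event is rare.
rows = simulate_random_intercept_logit(
    n_centers=29, n_per_center=100, intercept=-3.5,
    beta=[0.8, 0.0, 0.0], sd_intercept=0.5, seed=1)
event_rate = sum(y for _, _, y in rows) / len(rows)
print(f"{len(rows)} simulated patients, event rate {event_rate:.3f}")
```

In this sketch, variable selection corresponds to asking which entries of `beta` are nonzero, and variance selection corresponds to asking whether `sd_intercept` is zero, i.e. whether risk is homogeneous among centers.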