Computational Statistics and Data Analysis 56 (2012) 4399–4412
journal homepage: www.elsevier.com/locate/csda

A Bayesian generalized multiple group IRT model with model-fit assessment tools

Caio L.N. Azevedo a,∗, Dalton F. Andrade b, Jean-Paul Fox c
a Department of Statistics, University of Campinas, Brazil
b Department of Informatics and Statistics, Federal University of Santa Catarina, Brazil
c Department of Research Methodology, University of Twente, The Netherlands

article info
Article history: Received 2 June 2011; Received in revised form 23 March 2012; Accepted 24 March 2012; Available online 1 April 2012
Keywords: Multiple groups; Gibbs sampling; Posterior predictive checking; Bayesian residual analysis

abstract
The multiple group IRT model (MGM) proposed by Bock and Zimowski (1997) provides a useful framework for analyzing item response data from clustered respondents. In the MGM, the selected groups of respondents are of specific interest, such that group-specific population distributions need to be defined. The main goal is to explore the potential of an MCMC estimation procedure and Bayesian model-fit tools for the MGM. We develop a full Gibbs sampling algorithm (FGSA) for estimation as well as a Metropolis-Hastings within Gibbs sampling algorithm (MHWGS) in order to use non-conjugate priors. The FGSA is compared with Bilog–MG, which uses marginal maximum likelihood (MML) and marginal maximum a posteriori (MMAP) methods. That is, Bilog–MG provides maximum likelihood (ML) and expected a posteriori (EAP) estimates for both item and population parameters, and maximum a posteriori (MAP) estimates for the latent traits. We conclude that, in general, the results from our approach are slightly better than those from Bilog–MG. Besides a simultaneous MCMC estimation procedure, model-fit assessment tools are developed.
Furthermore, the prior sensitivity is investigated with respect to the parameters of the latent population distributions. It will be shown that the FGSA provides a wide set of model-fit tools. The proposed model assessment tools can evaluate important model assumptions of (1) the item response function (IRF) and (2) the latent trait distributions. The utility of the proposed estimation and model-fit assessment methods will be shown using data from a longitudinal study concerning first to fourth graders of sampled Brazilian public schools.
© 2012 Elsevier B.V. All rights reserved.

∗ Corresponding author. Tel.: +55 19 35216060. E-mail address: cnaber@ime.unicamp.br (C.L.N. Azevedo).
0167-9473/$, see front matter. doi:10.1016/j.csda.2012.03.017

1. Introduction

In educational assessment, clinical trials and bioassays, among other fields, it is common to observe examinees (subjects) from different groups. The groups can be characterized by gender, grade, social level, and so on. The group heterogeneity can reflect different behaviors. Therefore, it is important to take such heterogeneity into account.

Attention will be focused on applications where the number of groups is limited and/or there is a specific interest in the sampled groups. The population distribution representing the clustered respondents completely specifies the distribution of respondents in each group, and no assumptions will be made about groups that are not selected. Inferences can then be made with respect to the sampled groups, but not to some higher-level population of groups.

Bock and Zimowski (1997) developed an IRT model where each group has a specific latent trait distribution. This multiple group model (MGM) has an additional set of parameters: multiple population parameters, which characterize the latent