Computational Statistics and Data Analysis
journal homepage: www.elsevier.com/locate/csda

Latent profile analysis with nonnormal mixtures: A Monte Carlo examination of model selection using fit indices

Grant B. Morgan a,∗, Kari J. Hodge a,1, Aaron R. Baggett b,1

a Department of Educational Psychology, Baylor University, One Bear Place #97301, Waco, TX 76798-7301, USA
b Department of Psychology, University of Mary Hardin-Baylor, Box 8014, Belton, TX 76513-8014, USA

Article history: Received 30 April 2014; received in revised form 27 February 2015; accepted 28 February 2015; available online xxxx.

Keywords: Mixture model; Model selection; Nonnormal data

Abstract

The performances of fit indices used for model selection in cross-sectional mixture modeling with nonnormally distributed indicators were examined in two studies using Monte Carlo methods. Simulation conditions were selected to mirror conditions found in educational and psychological research. The design factors under investigation were: indicator distribution, number of indicators, sample size, and profile prevalence. All models contained five, ten, or fifteen continuous indicators with varying departures from normality. The fit indices examined were Akaike's information criterion (AIC), corrected Akaike's information criterion (AICc), consistent Akaike's information criterion (CAIC), Bayesian information criterion (BIC), sample size-adjusted Bayesian information criterion (SSBIC), Draper's information criterion (DIC), integrated classification likelihood criterion with Bayesian-type approximation (ICL), entropy, and the adjusted Lo–Mendell–Rubin likelihood ratio test (LMR). In the first study, nonnormally distributed data were used to estimate the mixture models. No fit index uniformly identified the simulated number of profiles using nonnormal indicators.
The fit indices that tended to identify the simulated number of profiles more frequently than the others were BIC, SSBIC, CAIC, and LMR, although the condition(s) in which this was observed varied. In the second study, the raw data were transformed using van der Waerden quantile normal scores. Despite deflating the indicator variances, the use of normal scores increased the frequency with which fit indices identified the simulated number of profiles across most conditions.

© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Classification procedures have been used for decades by researchers interested in classifying the individual cases of a heterogeneous dataset into homogeneous groups. During this time, classification methods have been applied in many disciplines, such as business, education, medicine, and the social sciences. Generally, classification refers to the process of dividing a large, heterogeneous set of observations into smaller, more homogeneous groups with reduced within-group variability and greater between-group variability (Clogg, 1995; Gordon, 1981; Heinen, 1996; Muthén and Muthén, 2000). The primary challenge facing researchers is that the frequency and form of the groups underlying a complex dataset are rarely known in advance. The frequency of the groups refers to the number and size of each group, and the form refers to the group-specific

∗ Corresponding author. Tel.: +1 254 710 7231; fax: +1 254 710 3265.
E-mail addresses: grant_morgan@baylor.edu (G.B. Morgan), kari_hodge@baylor.edu (K.J. Hodge), abaggett@umhb.edu (A.R. Baggett).
1 Supplementary material is available comprising the tables that report the frequency with which each fit index identified the competing component models, along with sample Mplus and SAS code.

http://dx.doi.org/10.1016/j.csda.2015.02.019
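The van der Waerden transformation mentioned in the abstract replaces each observation with the standard normal quantile of its rank, so the scores are approximately standard normal whatever the original indicator distribution. A minimal sketch in Python follows; the function name and the use of NumPy/SciPy are our own illustration and are not the authors' SAS or Mplus code:

```python
import numpy as np
from scipy.stats import norm, rankdata

def van_der_waerden_scores(x):
    """Transform a 1-D sample to van der Waerden normal scores.

    Each value is replaced by Phi^{-1}(r / (n + 1)), where r is the
    value's rank, n is the sample size, and Phi^{-1} is the standard
    normal quantile function.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    ranks = rankdata(x)  # average ranks are assigned to ties
    return norm.ppf(ranks / (n + 1))

# Example: a strongly right-skewed sample is mapped to scores that
# are symmetric around zero.
rng = np.random.default_rng(0)
skewed = rng.exponential(size=500)
scores = van_der_waerden_scores(skewed)
```

Because the scores depend only on the ranks, the transformation is monotone and preserves the ordering of cases within each indicator, which is why it can be applied to the raw data before re-estimating the mixture models.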