Computational Statistics & Data Analysis 51 (2007) 6180–6196
www.elsevier.com/locate/csda

On time series model selection involving many candidate ARMA models

Guoqi Qian a,*, Xindong Zhao b

a Department of Mathematics and Statistics, The University of Melbourne, VIC 3010, Australia
b Department of Mathematics and Statistical Science, La Trobe University, VIC 3086, Australia

* Corresponding author. Tel.: +61 3 8344 4899; fax: +61 3 8344 4599.
E-mail addresses: g.qian@ms.unimelb.edu.au (G. Qian), x.zhao@latrobe.edu.au (X. Zhao).

Received 16 June 2006; received in revised form 21 December 2006; accepted 22 December 2006
Available online 3 January 2007

0167-9473/$ - see front matter. doi:10.1016/j.csda.2006.12.044

Abstract

We study how to perform model selection for time series data where millions of candidate ARMA models may be eligible for selection. We propose a feasible computing method based on the Gibbs sampler. With this method, model selection is performed through a random sample generation algorithm and, given a model of fixed dimension, parameter estimation is done by maximum likelihood. Our method takes into account several computing difficulties encountered in estimating ARMA models. Under some regularity conditions, the method is found to select the best candidate model with probability 1 in the limit. We then propose several empirical rules for implementing our computing method in applications. Finally, a simulation study and an example on modelling China's Consumer Price Index (CPI) data are presented for the purpose of illustration and verification.

© 2007 Elsevier B.V. All rights reserved.

Keywords: Autoregressive-moving average (ARMA) models; Gibbs sampler and time series model selection

1. Introduction

Autoregressive-moving average (ARMA) processes are often used for modelling stationary time series. Selecting an appropriate ARMA model for an observed time series is an indispensable and integral part of the statistical analysis of ARMA processes.

Many ARMA model selection procedures and methods are available in the literature and in practice; they either preliminarily identify candidate ARMA models or formally search for the best ones. These include, among others, the graphical methods based on the autocorrelation and partial autocorrelation functions (ACF and PACF in Box et al., 1994) and the information-theoretic criteria such as AIC (Akaike, 1973, 1974), AICC (Hurvich and Tsai, 1989), BIC (Schwarz, 1978; Rissanen, 1978) and HQC (Hannan and Quinn, 1979).

The aim of this paper is not to add yet another ARMA model selection criterion to the rich literature in this area. Rather, we focus on a computational issue in ARMA model selection that is largely ignored in the literature but needs to be addressed when a very large number of candidate models are available for selection. Specifically, there are situations where one wants to use an ARMA$(p, q)$ model with some of the $p$ autoregressive and $q$ moving average coefficients possibly constrained to zero. If $p \le P$ and $q \le Q$ are known a priori, there are potentially $2^{P+Q}$ candidate ARMA