American Journal of Applied Mathematics and Statistics, 2014, Vol. 2, No. 3, 150-156
Available online at http://pubs.sciepub.com/ajams/2/3/9
© Science and Education Publishing
DOI:10.12691/ajams-2-3-9
On Optimal Weighting Scheme in Model Averaging
Georges Nguefack-Tsague*
Department of Public Health, University of Yaounde I, Biostatistics Unit, Yaoundé, Cameroon
*Corresponding author: nguefacktsague@yahoo.fr
Received April 14, 2014; Revised May 07, 2014; Accepted May 13, 2014
Abstract Model averaging is an alternative to model selection and involves assigning weights to different models.
A natural question that arises is whether there is an optimal weighting scheme. Various authors have shown the
existence of such schemes in other methodological frameworks. This paper investigates the derivation of optimal
weights for model averaging under squared error loss. It is shown that although these weights may exist in theory,
they depend on model parameters; once estimated, they are no longer optimal. Using a linear regression example, it
is demonstrated that model averaging estimators with these estimated weights are unlikely to outperform post-model
selection estimators and other model averaging estimators. We provide a theoretical justification for this phenomenon.
Keywords: model averaging, model selection, optimal weight, squared error loss, model uncertainty
Cite This Article: Georges Nguefack-Tsague, “On Optimal Weighting Scheme in Model Averaging.”
American Journal of Applied Mathematics and Statistics, vol. 2, no. 3 (2014): 150-156. doi: 10.12691/ajams-2-3-9.
1. Introduction
In most statistical modeling applications, there are
several models that are a priori plausible. It is quite
common nowadays to apply some model selection
procedure to select a single one. Overviews, explanations,
discussions, and examples of such methods may be found in
the books by Linhart and Zucchini [1], McQuarrie and
Tsai [2], Burnham and Anderson [3] and Claeskens and
Hjort [4].
An alternative to selecting a single model for estimation
purposes is to assign weights to all plausible models and to
work with the resulting weighted estimator. This leads to
the class of model averaging estimators. Once the weights
have been decided upon (they can be derived from a model
selection criterion such as Akaike’s information criterion
(AIC), or arise from Bayesian motivations), the problem
is not so much the construction of the estimator as its
properties.
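As a simple illustration of how such weights can be constructed from a selection criterion, the sketch below computes so-called Akaike weights, which turn AIC values into model-averaging weights; the AIC values and per-model estimates used here are hypothetical, not taken from the paper.

```python
import numpy as np

def akaike_weights(aic_values):
    """Convert AIC values into model-averaging weights.

    Each model's weight is proportional to exp(-0.5 * delta_m),
    where delta_m = AIC_m - min(AIC), so weights are in (0, 1]
    and sum to one.
    """
    aic = np.asarray(aic_values, dtype=float)
    delta = aic - aic.min()
    w = np.exp(-0.5 * delta)
    return w / w.sum()

# Hypothetical AIC values for three candidate models.
aic = [100.0, 101.2, 104.5]
w = akaike_weights(aic)

# Hypothetical per-model estimates of the same quantity; the model
# averaging estimator is their weighted combination.
estimates = np.array([2.1, 2.4, 1.8])
theta_avg = float(w @ estimates)
```

Note that model selection is recovered as the special case in which the best model receives weight one and all others weight zero.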
Since model selection corresponds to the special case of
assigning weight one to the selected model and weight
zero to all other considered models, the question is equally
relevant for estimators obtained after model selection. We
refer to these estimators as post-model selection
estimators (PMSE). The fact that selection was data-based
is often ignored in the subsequent analysis and leads to
invalid inferences. Literature on this topic includes but is
not limited to Bancroft [5] for pre-test estimators, Breiman
[6], Hjorth [7], Chatfield [8], Draper [9], Buckland et al.
[10], Zucchini [11], Candolo et al. [12], Hjort and
Claeskens [13], Efron [14], Leeb and Pötscher [15],
Longford [17], Claeskens and Hjort [4], Schomaker et al.
[18], Zucchini et al. [19], Liu and Yang [20], Nguefack-
Tsague and Zucchini [21], Nguefack-Tsague et al. [26],
and Nguefack-Tsague [22,23,24,25]. Bayesian model
averaging can be found in Hoeting et al. [27] and
Wasserman [28]. Wang et al. [29] provide a review of
frequentist model averaging estimators.
Many optimal weighting schemes have been proposed
recently for model averaging. Hansen [30] discusses
model averaging in least squares estimation and proposes
a method that selects the weights by minimizing Mallows’
criterion. Furthermore, Hansen [31] suggests using
Mallows’ model averaging method for forecasting and
shows that Mallows’ criterion is an asymptotically
unbiased estimator of both the in-sample mean squared
error and the out-of-sample one-step-ahead mean squared
forecast error. Hansen [32] studies least squares estimation
of an autoregressive model with a root close to unity by
proposing two measures to evaluate the efficiency of the
estimators: the asymptotic mean squared error and
forecast expected squared error. Numerical comparison of
Mallows’ model averaging method with many other
methods shows that Mallows’ model averaging estimator
often has smaller risk. Hansen [33] applies the same idea
for model averaging with autoregressions with a near unit
root. Since Hansen [30] assumes that the models are
nested and the weights are discrete, Wan et al. [34]
relax these two assumptions to obtain other versions of
model averaging by minimizing Mallows’ criterion. Their
proofs are based on Li [35]. Liang et al. [36] develop a
model weighting mechanism that involves minimizing the
trace of an unbiased estimator of the model average
estimator’s MSE. Hansen and Racine [37] propose
selecting the weights of the least squares model averaging
estimator by minimizing a delete-one cross-validation
criterion (jackknife model averaging, JMA). The
solutions of the above methods are obtained by quadratic
programming. Zhang et al. [38] propose a model
averaging scheme for linear mixed-effects models and
prove their method to be asymptotically optimal under
some regularity conditions.
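As a rough illustration of the Mallows-type weight selection discussed above, the sketch below chooses the weight between two nested least squares fits by minimizing a Mallows criterion (fitted-value sum of squared errors plus a penalty proportional to the weighted number of parameters) over a grid on the simplex. The simulated data, the two candidate models, and the grid resolution are illustrative assumptions, not the setup of any of the cited papers.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = rng.standard_normal((n, 3))
beta = np.array([1.0, 0.4, 0.0])  # true coefficients; last regressor is irrelevant
y = X @ beta + rng.standard_normal(n)

def fitted(Xk):
    """Least squares fitted values for the design matrix Xk."""
    bh = np.linalg.lstsq(Xk, y, rcond=None)[0]
    return Xk @ bh

# Two nested candidate models: first regressor only, and the full model.
mu_small, mu_full = fitted(X[:, :1]), fitted(X)
k_small, k_full = 1.0, 3.0

# sigma^2 estimated from the largest candidate model.
sigma2 = np.sum((y - mu_full) ** 2) / (n - 3)

def mallows(w):
    """Mallows criterion for weight w on the full model (1 - w on the small one)."""
    mu = (1 - w) * mu_small + w * mu_full
    e = y - mu
    return e @ e + 2.0 * sigma2 * ((1 - w) * k_small + w * k_full)

# With only one free weight, the simplex is the interval [0, 1];
# a fine grid search stands in for the quadratic programming step.
grid = np.linspace(0.0, 1.0, 1001)
w_opt = grid[np.argmin([mallows(w) for w in grid])]
```

In the general case with more than two candidate models, the criterion is quadratic in the weight vector and is minimized over the full probability simplex by quadratic programming, as noted above.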