American Journal of Applied Mathematics and Statistics, 2014, Vol. 2, No. 3, 150-156
Available online at http://pubs.sciepub.com/ajams/2/3/9
© Science and Education Publishing
DOI:10.12691/ajams-2-3-9
On Optimal Weighting Scheme in Model Averaging
Georges Nguefack-Tsague*
Department of Public Health, University of Yaounde I, Biostatistics Unit, Yaoundé, Cameroon
*Corresponding author: nguefacktsague@yahoo.fr
Received April 14, 2014; Revised May 07, 2014; Accepted May 13, 2014
Abstract Model averaging is an alternative to model selection and involves assigning weights to different models.
A natural question that arises is whether there is an optimal weighting scheme. Various authors have shown the
existence of such schemes in other methodological frameworks. This paper investigates the derivation of optimal
weights for model averaging under squared error loss. It is shown that although these weights may exist in theory,
they depend on model parameters; once estimated, they are no longer optimal. Using a linear regression example, it
is demonstrated that model averaging estimators with these estimated weights are unlikely to outperform post-model
selection estimators and other model averaging estimators. We provide a theoretical justification for this phenomenon.
Keywords: model averaging, model selection, optimal weight, squared error loss, model uncertainty
Cite This Article: Georges Nguefack-Tsague, “On Optimal Weighting Scheme in Model Averaging.”
American Journal of Applied Mathematics and Statistics, vol. 2, no. 3 (2014): 150-156. doi: 10.12691/ajams-2-3-9.
1. Introduction
In most statistical modeling applications, there are
several models that are a priori plausible. It is quite
common nowadays to apply some model selection
procedure to select a single one. Overviews, explanations,
discussions, and examples of such methods may be found in
the books by Linhart and Zucchini [1], McQuarrie and
Tsai [2], Burnham and Anderson [3] and Claeskens and
Hjort [4].
An alternative to selecting a single model for estimation
purposes is to assign weights to all plausible models and to
work with the resulting weighted estimator. This leads to
the class of model averaging estimators. Once the weights
have been decided upon (they can be derived from a model
selection criterion such as Akaike’s information criterion
(AIC), or arise from Bayesian motivations), the problem
is not so much the construction of the estimator as its
properties.
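As a simple illustration of how such weights can be constructed from a selection criterion, the sketch below computes so-called Akaike weights, which turn AIC values into model-averaging weights; the AIC values and per-model estimates used here are hypothetical, not taken from the paper.

```python
import numpy as np

def akaike_weights(aic_values):
    """Convert AIC values into model-averaging weights.

    Each model's weight is proportional to exp(-0.5 * delta_m),
    where delta_m = AIC_m - min(AIC), so weights are in (0, 1]
    and sum to one.
    """
    aic = np.asarray(aic_values, dtype=float)
    delta = aic - aic.min()
    w = np.exp(-0.5 * delta)
    return w / w.sum()

# Hypothetical AIC values for three candidate models.
aic = [100.0, 101.2, 104.5]
w = akaike_weights(aic)

# Hypothetical per-model estimates of the same quantity; the model
# averaging estimator is their weighted combination.
estimates = np.array([2.1, 2.4, 1.8])
theta_avg = float(w @ estimates)
```

Note that model selection is recovered as the special case in which the best model receives weight one and all others weight zero.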
Since model selection corresponds to the special case of
assigning weight one to the selected model and weight
zero to all other considered models, the question is equally
relevant for estimators obtained after model selection. We
refer to these estimators as post-model selection
estimators (PMSE). The fact that selection was data-based
is often ignored in the subsequent analysis and leads to
invalid inferences. Literature on this topic includes but is
not limited to Bancroft [5] for pre-test estimators, Breiman
[6], Hjorth [7], Chatfield [8], Draper [9], Buckland et al.
[10], Zucchini [11], Candolo et al. [12], Hjort and
Claeskens [13], Efron [14], Leeb and Pötscher [15],
Longford [17], Claeskens and Hjort [4], Schomaker et al.
[18], Zucchini et al. [19], Liu and Yang [20], Nguefack-
Tsague and Zucchini [21], Nguefack-Tsague et al. [26],
and Nguefack-Tsague [22,23,24,25]. Bayesian model
averaging can be found in Hoeting et al. [27] and
Wasserman [28]. Wang et al. [29] provide a review of
frequentist model averaging estimators.
Many optimal weighting schemes have been proposed
recently for model averaging. Hansen [30] discusses
model averaging in least squares estimation and proposes
a method that selects the weights by minimizing Mallows’
criterion. Furthermore, Hansen [31] suggests using
Mallows’ model averaging method for forecasting and
shows that Mallows’ criterion is an asymptotically
unbiased estimator of both the in-sample mean squared
error and the out-of-sample one-step-ahead mean squared
forecast error. Hansen [32] studies least squares estimation
of an autoregressive model with a root close to unity by
proposing two measures to evaluate the efficiency of the
estimators: the asymptotic mean squared error and
forecast expected squared error. Numerical comparison of
Mallows’ model averaging method with many other
methods shows that Mallows’ model averaging estimator
often has smaller risk. Hansen [33] applies the same idea
for model averaging with autoregressions with a near unit
root. Since Hansen [30] assumes that the models are
nested and the weights are discrete, Wan et al. [34]
relax these two assumptions to obtain other versions of
model averaging by minimizing Mallows’ criterion. Their
proofs are based on Li [35]. Liang et al. [36] develop a
model weighting mechanism that involves minimizing the
trace of an unbiased estimator of the model average
estimator’s MSE. Hansen and Racine [37] propose
selecting the weights of the least squares model averaging
estimator by minimizing a delete-one cross-validation
criterion (jackknife model averaging, JMA). The
solutions of the above methods are obtained by quadratic
programming. Zhang et al. [38] propose a model
averaging scheme for linear mixed-effects models and
prove their method to be asymptotically optimal under
some regularity conditions.
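As a rough illustration of the Mallows-type weight selection discussed above, the sketch below chooses the weight between two nested least squares fits by minimizing a Mallows criterion (fitted-value sum of squared errors plus a penalty proportional to the weighted number of parameters) over a grid on the simplex. The simulated data, the two candidate models, and the grid resolution are illustrative assumptions, not the setup of any of the cited papers.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = rng.standard_normal((n, 3))
beta = np.array([1.0, 0.4, 0.0])  # true coefficients; last regressor is irrelevant
y = X @ beta + rng.standard_normal(n)

def fitted(Xk):
    """Least squares fitted values for the design matrix Xk."""
    bh = np.linalg.lstsq(Xk, y, rcond=None)[0]
    return Xk @ bh

# Two nested candidate models: first regressor only, and the full model.
mu_small, mu_full = fitted(X[:, :1]), fitted(X)
k_small, k_full = 1.0, 3.0

# sigma^2 estimated from the largest candidate model.
sigma2 = np.sum((y - mu_full) ** 2) / (n - 3)

def mallows(w):
    """Mallows criterion for weight w on the full model (1 - w on the small one)."""
    mu = (1 - w) * mu_small + w * mu_full
    e = y - mu
    return e @ e + 2.0 * sigma2 * ((1 - w) * k_small + w * k_full)

# With only one free weight, the simplex is the interval [0, 1];
# a fine grid search stands in for the quadratic programming step.
grid = np.linspace(0.0, 1.0, 1001)
w_opt = grid[np.argmin([mallows(w) for w in grid])]
```

In the general case with more than two candidate models, the criterion is quadratic in the weight vector and is minimized over the full probability simplex by quadratic programming, as noted above.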