Citation: Hardy, N. “A Bias Recognized Is a Bias Sterilized”: The Effects of a Bias in Forecast Evaluation. Mathematics 2022, 10, 171. https://doi.org/10.3390/math10020171
Academic Editors: Javier Perote, Andrés Mora-Valencia and Trino-Manuel Ñíguez
Received: 16 November 2021
Accepted: 1 January 2022
Published: 6 January 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
“A Bias Recognized Is a Bias Sterilized”: The Effects of a Bias in Forecast Evaluation

Nicolas Hardy 1,2

1 Facultad de Economía y Empresa, Universidad Diego Portales, Santiago 8370179, Chile; nicolas.hardy@udp.cl
2 School of Business, Universidad Adolfo Ibáñez, Santiago 8380629, Chile
Abstract: Are traditional tests of forecast evaluation well behaved when the competing (nested)
model is biased? No, they are not. In this paper, we show analytically and via simulations that, under
the null hypothesis of no encompassing, a bias in the nested model may severely distort the size
properties of traditional out-of-sample tests in economic forecasting. Not surprisingly, these size
distortions depend on the magnitude of the bias and the persistence of the additional predictors. We
consider two different cases: (i) There is both in-sample and out-of-sample bias in the nested model.
(ii) The bias is present exclusively out-of-sample. To address the former case, we propose a modified
encompassing test (MENC-NEW) robust to a bias in the null model. Akin to the ENC-NEW statistic,
the asymptotic distribution of our test is a functional of stochastic integrals of quadratic Brownian
motions. While this distribution is not pivotal, we can easily estimate the nuisance parameters. To
address the second case, we derive the new asymptotic distribution of the ENC-NEW, showing that
critical values may differ remarkably. Our Monte Carlo simulations reveal that the MENC-NEW (and
the ENC-NEW with adjusted critical values) is reasonably well-sized even when the ENC-NEW (with
standard critical values) exhibits rejection rates three times higher than the nominal size.
Keywords: forecasting; random walk; out-of-sample; bias; prediction; mean square prediction error
1. Introduction
“Fortunately for serious minds, a bias recognized is a bias sterilized” (Benjamin Haydon).
The seminal papers of Diebold and Mariano (1995) [1] and West (1996) [2] are typically
pinpointed as the Big Bang of the forecast evaluation literature in economics and finance.
Even though both papers propose asymptotically normal tests for forecast evaluation, the
proper environment of each paper is different. The former considers the case of comparing
forecasts (i.e., the forecasts are assumed to be given), while the latter focuses on the case
of comparing models (i.e., the forecasts are constructed through estimated parametric
models). Put simply, the key contribution of West’s asymptotic theory is that it accounts for
parameter uncertainty.
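To fix ideas, the comparison-of-forecasts setting can be sketched with a textbook Diebold–Mariano statistic under squared-error loss. This is a minimal illustrative implementation, not the paper's code; the function name and the truncated HAC variance with h - 1 lags are standard but chosen here for simplicity.

```python
import numpy as np

def diebold_mariano(e1, e2, h=1):
    """Diebold-Mariano test for equal predictive accuracy under squared-error loss.

    e1, e2 : forecast-error series from two competing (given) forecasts.
    h      : forecast horizon; the HAC variance uses h - 1 autocovariance lags.
    Returns the DM statistic, asymptotically N(0, 1) under the null of
    equal expected loss (forecasts taken as given, no parameter uncertainty).
    """
    d = e1**2 - e2**2                       # loss differential
    T = len(d)
    dbar = d.mean()
    # truncated HAC estimate of the long-run variance of d
    lrv = np.mean((d - dbar)**2)
    for k in range(1, h):
        gk = np.mean((d[k:] - dbar) * (d[:-k] - dbar))
        lrv += 2 * gk
    return dbar / np.sqrt(lrv / T)

# usage: two equally accurate forecasts should give a moderate statistic
rng = np.random.default_rng(0)
stat = diebold_mariano(rng.standard_normal(500), rng.standard_normal(500))
```

West's [2] contribution is precisely what this sketch omits: when e1 and e2 come from estimated models, the asymptotic variance must account for parameter uncertainty.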
While the asymptotic theory of [2] is quite general and allows a variety of estimation
techniques and loss functions, it is not universal. One of the key assumptions in West’s
theory is a full rank condition over the long-run variance of the loss function when parameters are set at their true values. One of the most iconic cases in which this condition is not
fulfilled is the comparison of two competing nested models: Under the null hypothesis
of no encompassing, both models are identical, and standard tests of forecast evaluation
become degenerate. As pointed out by Clark and West (2006) [3] and West (2006) [4], for
the case of MSPE, this degeneracy is not only important in a theoretical sense, but also in
an empirical one: “[ ... ] use of standard critical values usually results in very poorly sized tests,
with far too few rejections. As well, the usual statistic has very poor power” ([4], p. 119). (A note of
caution here: we are not arguing that this degeneracy necessarily implies that tests become
undersized. Some of the simulations in [5] suggest both types of size distortions: sometimes
tests are undersized, sometimes they are oversized.)
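The source of the degeneracy can be seen in a small Monte Carlo sketch. This is an illustrative setup, not the paper's DGP: under the null, y is pure noise, so the nested model (a recursive mean) and the nesting model (which adds an irrelevant predictor x) are identical in population, and the sample MSPE differential reflects only estimation noise.

```python
import numpy as np

rng = np.random.default_rng(42)
T, R, reps = 200, 100, 300   # sample size, first estimation window, MC replications
avg_diffs = []
for _ in range(reps):
    x = rng.standard_normal(T)
    y = rng.standard_normal(T)            # null DGP: x has no predictive content
    d = []
    for t in range(R, T - 1):
        # model 1 (nested): recursive-mean forecast of y
        f1 = y[1:t + 1].mean()
        # model 2 (nesting): OLS of y_{s+1} on (1, x_s), recursive window
        X = np.column_stack([np.ones(t), x[:t]])
        b = np.linalg.lstsq(X, y[1:t + 1], rcond=None)[0]
        f2 = b[0] + b[1] * x[t]
        d.append((y[t + 1] - f1) ** 2 - (y[t + 1] - f2) ** 2)
    avg_diffs.append(np.mean(d))
# Estimating the irrelevant coefficient inflates model 2's MSPE, so the
# average loss differential (MSPE1 - MSPE2) is typically negative under the null.
avg_diff = np.mean(avg_diffs)
print(avg_diff)
```

This noncentered loss differential is what pushes naive MSPE comparisons toward too few rejections and motivates adjusted statistics such as the ENC-NEW.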