Research Article
Comparing the Forecast Performance of Advanced Statistical and
Machine Learning Techniques Using Huge Big Data:
Evidence from Monte Carlo Experiments
Faridoon Khan,1 Amena Urooj,1 Saud Ahmed Khan,1 Abdelaziz Alsubie,2 Zahra Almaspoor,3 and Sara Muhammadullah1
1 PIDE School of Economics, Pakistan Institute of Development Economics, Islamabad, Pakistan
2 Department of Basic Sciences, College of Science and Theoretical Studies, Saudi Electronic University, Riyadh, Saudi Arabia
3 Department of Statistics, Yazd University, Yazd 89175-741, Iran
Correspondence should be addressed to Zahra Almaspoor; z.almaspoor@stu.yazd.ac.ir
Received 12 October 2021; Revised 17 November 2021; Accepted 30 November 2021; Published 14 December 2021
Academic Editor: Paulo Jorge Silveira Ferreira
Copyright © 2021 Faridoon Khan et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This research compares factor models based on principal component analysis (PCA) and partial least squares (PLS) with
Autometrics, elastic smoothly clipped absolute deviation (E-SCAD), and minimax concave penalty (MCP) under different
simulated schemes, namely multicollinearity, heteroscedasticity, and autocorrelation. The comparison is made with varying sample
sizes and numbers of covariates. We found that in the presence of low and moderate multicollinearity, MCP often produces superior
forecasts, except in the small-sample case, where E-SCAD remains better. In the case of high multicollinearity, the PLS-based
factor model remained dominant, but asymptotically the prediction accuracy of E-SCAD improves significantly relative to the other
methods. Under heteroscedasticity, MCP performs very well and most of the time beats the rival methods. In some circumstances
under large samples, Autometrics provides forecasts similar to those of MCP. In the presence of low and moderate autocorrelation,
MCP shows outstanding forecasting performance, except in the small-sample case, where E-SCAD produces a remarkable forecast.
In the case of extreme autocorrelation, E-SCAD outperforms the rival techniques under both small and medium samples, but a
further increase in sample size enables MCP to forecast comparatively more accurately. To compare the predictive ability of all
methods, we split the data into two parts (i.e., data over 1973–2007 as training data and data over 2008–2020 as testing data).
Based on the root mean square error and mean absolute error, the PLS-based factor model outperforms the competitor models in
terms of forecasting performance.
1. Introduction
The prediction of macroeconomic variables is very important
in macroeconomic studies, monetary policy analysis, and
environmental economics. Accurate forecasts yield sound
insights into the mechanisms of dynamic economies [1], more
effective monetary policies [2], and better portfolio management
and hedging strategies [3]. In today's data-rich environment,
many macroeconomic series are tracked by economists and
decision-makers.
Low-dimensional models, for instance vector autoregressions,
often include only a few prespecified economic covariates and
therefore have difficulty capturing the dynamic and complex
patterns contained in huge panels of time series [4]. Omitting
important variables leads to an underspecified model and
hence biased results. There is an intense need to propose
updated statistical models and analysis frameworks that
extend their low-dimensional counterparts for improved
forecasts. Thus, in the recent era, the analysis of
“Big Data” has become the core of economics research. This
in turn has resulted in special attention being paid to the
huge class of techniques that are available in the domain of
machine learning, dimension reduction, and penalized
Hindawi
Complexity
Volume 2021, Article ID 6117513, 11 pages
https://doi.org/10.1155/2021/6117513