Article

A Metaheuristics-Based Inputs Selection and Training Set Formation Method for Load Forecasting

Ioannis Panapakidis *, Michail Katsivelakis and Dimitrios Bargiotas

Department of Electrical and Computer Engineering, University of Thessaly, 38221 Volos, Greece
* Correspondence: panapakidis@uth.gr; Tel.: +30-24-2107-4821

Citation: Panapakidis, I.; Katsivelakis, M.; Bargiotas, D. A Metaheuristics-Based Inputs Selection and Training Set Formation Method for Load Forecasting. Symmetry 2022, 14, 1733. https://doi.org/10.3390/sym14081733

Academic Editor: Theodore E. Simos

Received: 9 May 2022
Accepted: 13 June 2022
Published: 19 August 2022

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Abstract: Load forecasting is a procedure of fundamental importance in power systems operation and planning. Many entities can benefit from accurate load forecasting, such as generation companies, system operators, retailers, prosumers, and others. A variety of models have been proposed so far in the literature. Among them, artificial neural networks are a favorable approach, mainly due to their potential for capturing the relationship between load and other parameters. The forecasting performance highly depends on the number and types of inputs. This paper presents a two-step particle swarm optimization (PSO) method for increasing the performance of short-term load forecasting (STLF). During the first step, PSO is applied to derive the optimal types of inputs for a neural network. Next, PSO is applied again so that the available training data is split into homogeneous clusters. For each cluster, a different neural network is utilized.
Experimental results verify the robustness of the proposed approach in a bus load forecasting problem. The proposed algorithm is also tested on a load profiling problem, where it outperforms the most common algorithms of the load profiling-related literature. During input selection, the weights update is held in asymmetrical duration. The weights of the training phase require more time compared with the test phase.

Keywords: clustering; load forecasting; metaheuristics; neural networks; particle swarm optimization

1. Introduction

Load forecasting forms the pillar that power systems operation and planning rely on [1]. In day-ahead markets, the system operator provides the official forecasts so that the generation companies can prepare the energy/price bids for the wholesale markets [2]. In the long-term horizon, demand forecasting is vital for generation capacity expansion scenarios in the energy sector [3,4]. In competitive energy markets, demand forecasting is also vital for aggregators, retailers, and prosumers [5–7]. The importance of load forecasting is reflected in the large number of research studies, pilot programs, and relevant applications [8]. In general terms, forecasting models can be classified into time series models, computational intelligence-based models, and hybrid ones [9]. In time series models, the structure of the model, i.e., the number of time lags, auto-regressive components, and other parameters, should be known in advance. A special effort should be made to derive the model’s structure by utilizing a set of statistical tests. Time series models include ARMA, ARIMA, GARCH, and others. Historically, they were the first to be proposed in the literature [10–12]. On the other hand, in computational intelligence-based models like neural networks, support vector machines, and others, there are no requirements for an a priori definition of the structure [13–15]. The latter is derived from the training procedure.
Hybrid models usually build on the previous categories: a time series model and a computational intelligence-based model are combined, or a time series processing technique or other method is applied prior to the application of the main forecaster [16–18].
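To illustrate the remark that a time series model needs its structure fixed in advance, the following is a minimal sketch of a purely autoregressive AR(p) model fitted by ordinary least squares. It is not the method of this paper. The function names `fit_ar` and `forecast_next`, the use of NumPy, and the least-squares fitting are assumptions made for illustration only. Note that the lag order `p` must be supplied by the user before any fitting takes place.

```python
import numpy as np


def fit_ar(series, p):
    """Fit an AR(p) model by ordinary least squares (illustrative sketch).

    The lag order p is the "structure" that has to be chosen beforehand.
    Returns (intercept, coeffs), where coeffs[k] multiplies the value
    observed k + 1 steps in the past.
    """
    y = np.asarray(series, dtype=float)
    n = len(y)
    if p < 1 or n <= p:
        raise ValueError("need 1 <= p < len(series)")
    # Column k holds the series shifted back by k steps, aligned with y[p:].
    lagged = np.column_stack([y[p - k:n - k] for k in range(1, p + 1)])
    design = np.column_stack([np.ones(n - p), lagged])
    beta, *_ = np.linalg.lstsq(design, y[p:], rcond=None)
    return beta[0], beta[1:]


def forecast_next(series, intercept, coeffs):
    """One-step-ahead forecast from the most recent len(coeffs) values."""
    y = np.asarray(series, dtype=float)
    p = len(coeffs)
    recent = y[-1:-p - 1:-1]  # y[t-1], y[t-2], ..., y[t-p]
    return intercept + float(np.dot(coeffs, recent))


# Hypothetical usage on a synthetic hourly load-like series.
rng = np.random.default_rng(0)
hours = np.arange(24 * 14)
load = 100 + 20 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 1, hours.size)
intercept, coeffs = fit_ar(load, p=24)  # p = 24 is a choice made in advance
next_hour = forecast_next(load, intercept, coeffs)
```

In this sketch, changing `p` changes the model itself, which is why statistical tests are typically used to select it. A neural network, by contrast, learns the mapping from inputs to load during training, although the choice of inputs still matters.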