Neural Networks 71 (2015) 204–213

Contents lists available at ScienceDirect

Neural Networks

journal homepage: www.elsevier.com/locate/neunet

Prediction of telephone calls load using Echo State Network with exogenous variables

Filippo Maria Bianchi a,∗, Simone Scardapane a, Aurelio Uncini a, Antonello Rizzi a, Alireza Sadeghian b

a Department of Information Engineering, Electronics and Telecommunications (DIET), “Sapienza” University of Rome, Via Eudossiana 18, 00184 Rome, Italy
b Department of Computer Science, Ryerson University, 350 Victoria Street, Toronto, ON M5B 2K3, Canada

∗ Corresponding author. Tel.: +39 06 44585495; fax: +39 06 4873300.
E-mail addresses: filippomaria.bianchi@uniroma1.it (F.M. Bianchi), simone.scardapane@uniroma1.it (S. Scardapane), aurelio.uncini@uniroma1.it (A. Uncini), antonello.rizzi@uniroma1.it (A. Rizzi), asadeghi@ryerson.ca (A. Sadeghian).

article info

Article history:
Received 5 June 2015
Received in revised form 23 July 2015
Accepted 28 August 2015
Available online 7 September 2015

Keywords:
Time-series
Forecasting
Echo State Networks
Exogenous variables
Genetic algorithm
Call data records

abstract

We approach the problem of forecasting the load of incoming calls in a cell of a mobile network using Echo State Networks. With respect to previous approaches to the problem, we include additional telephone records regarding the activity registered in the cell as exogenous variables, and we investigate their usefulness in the forecasting task. Additionally, we analyze different methodologies for training the readout of the network, including two novel variants, namely ν-SVR and an elastic net penalty. Finally, we employ a genetic algorithm both for tuning the parameters of the system and for selecting the optimal subset of the most informative additional time-series to be considered as external inputs in the forecasting problem. We compare the performance with that of standard prediction models and we evaluate the results according to the specific properties of the considered time-series.

© 2015 Elsevier Ltd. All rights reserved.

1. Introduction

Time-Series Forecasting (TSF) refers to the problem of predicting future values of a time-series (TS), starting from a previously observed history (De Gooijer & Hyndman, 2006). In this paper, we are concerned specifically with the TSF problem of telephone activity loads. This is closely related to the forecasting of workload in call centers (Aksin, Armony, & Mehrotra, 2007) where, usually, only the TS containing the load of incoming calls is taken into account, and the other external variables considered for the prediction are of a very different nature (e.g. advertisement, catalogs, calendar effects; Andrews & Cunningham, 1995; Antipov & Meade, 2002; Soyer & Tarimcilar, 2008). An accurate Short-Term Load Forecast (STLF) method would save operating costs, keep power markets efficient and provide a better understanding of the dynamics of the observed system. On the other hand, a wrong prediction could cause either a load overestimation, which leads to reserving excess resources and consequently to more costs and contract curtailments for market participants, or a load underestimation, which results in failures to provide enough reserves and thereby in more costly ancillary services (Bunn, 2000; Ruiz & Gross, 2008). Specifically, in this work we treat the problem of STLF relative to the telephonic activities registered in a cell covered by an antenna of a mobile phone network. For each cell there are different kinds of data that describe the volume and the number of both outgoing and incoming calls, from which we generate different TSs.
Our work focuses on forecasting the values of a specific TS using its past measurements and leveraging the information contained in the remaining TSs, which are treated as exogenous variables and presented as input to the system along with the TS that must be predicted. In particular, in this work we consider call records collected in the Orange telephone dataset published for the “Data for Development” (D4D) challenge (Blondel et al., 2012). More information on the TSs and on how they are generated in a pre-processing phase is provided in Section 3. As the forecasting method we use a standard Echo State Network (ESN) (Butcher, Verstraeten, Schrauwen, Day, & Haycock, 2013; Jaeger & Haas, 2004; Lukoševičius & Jaeger, 2009; Verstraeten, Schrauwen, d’Haene, & Stroobandt, 2007), which is a particular class of Recurrent Neural Network (RNN).

http://dx.doi.org/10.1016/j.neunet.2015.08.010
0893-6080/© 2015 Elsevier Ltd. All rights reserved.

The main peculiarity of ESNs is that the recurrent part of the network (the reservoir) is considered fixed, and only a non-recurrent part (termed readout) is