Performance of Modeling Time Series Using
Nonlinear Autoregressive with eXogenous input
(NARX) in the Network Traffic Forecasting
Haviluddin
Faculty of Mathematics and Natural Science
Dept. of Computer Science, Universitas Mulawarman,
Indonesia
haviluddin@unmul.ac.id
Rayner Alfred
Faculty of Computing and Informatics,
Dept. of Computer Science, Universiti Malaysia Sabah,
Malaysia
ralfred@ums.edu.my
Abstract — A time-series data analysis and prediction tool for
learning network traffic usage data is important for ensuring that
network services of acceptable quality can be provided to an
organization (e.g., a university). This paper presents the modeling and
prediction of network traffic datasets using a nonlinear autoregressive
with eXogenous input (NARX) algorithm. The best-performing NARX model,
based on the architecture 189:31:94 (60%:10%:30%) with a delay value
of 5, achieves a Mean Squared Error (MSE) of 0.006717 and a correlation
coefficient, r, of 0.90764. In short, the NARX technique is shown to
learn network traffic effectively with acceptable predictive accuracy.
Keywords—NARX; network traffic; MSE; correlation
coefficient
I. INTRODUCTION
Time series analysis tools for modelling and forecasting time
series datasets are widely used in various fields, including
economics (e.g., business, finance, foreign exchange, and stock
markets), investment, engineering, energy, the internet, and
network traffic. Accurate prediction is needed to support
decision making. In the literature, numerous strategies have
been established within the general framework of time series
prediction. These techniques can be grouped into two main
categories: statistical and machine learning (ML) methods.
Statistical methods include autoregressive (AR), moving
average (MA), autoregressive moving-average (ARMA),
autoregressive integrated moving-average (ARIMA),
generalized autoregressive conditional heteroskedasticity
(GARCH), and seasonal autoregressive integrated
moving-average (SARIMA) models. Statistical methods are
reliable for forecasting when the dataset is small and the data
are linear. With large datasets, however, their forecasts become
less accurate because the resulting mathematical model is quite
complicated, and they are difficult to apply to nonlinear data
[1-3].
On the other hand, machine learning (ML) methods have also
been used alongside these statistical models. For instance, the
Artificial Neural Network (ANN) is an ML method that has been
widely used for analyzing and forecasting time series data over
the past four decades. Many researchers have used ANNs as a
time series analysis method because of their efficiency in
solving both linear and nonlinear problems [4-6]. ANN variants
include the multilayer perceptron with backpropagation (BP),
recurrent neural networks (RNN), and radial basis function
(RBF) neural networks, which can provide efficient and accurate
forecasting and are particularly able to handle nonlinear data as
a representation of the real world [1, 2, 7-12]. The motivation
of this paper is to present a topology and training scheme of a
neural network that is able to forecast network traffic with
some degree of accuracy using one-step-ahead prediction. It is
hoped that this paper can provide insights that help network
engineers and managers provide efficient bandwidth traffic
control for the campus community. This paper studies the
Nonlinear AutoRegressive with eXogenous input (NARX)
neural network model in order to address time series data with
nonlinear characteristics. Section II describes the methodology
used in this work. Section III outlines the experimental setup.
Section IV presents the analysis and discussion of the results,
and Section V concludes this paper.
II. METHODOLOGY
This section presents related work on general network traffic
prediction models, including time series analysis using the
NARX model.
A. Time Series
A time series dataset consists of observations ordered in
time. In principle, a time series model is used to predict future
values of the data based on past data [13]. In this study, the
time series dataset was obtained from the ICT server of
Universitas Mulawarman. The network traffic data were
captured using the CACTI software from 20 to 26 June 2013
(314 samples). The dataset and its plot are shown in Table I
and Fig. 1, respectively.
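
As an illustration of predicting a value from past data, the following minimal Python sketch arranges a series into lagged input/target pairs and partitions the sample count. It is not the authors' implementation. The function names `make_lagged` and `split_counts` are hypothetical, the sine series is a synthetic placeholder rather than the CACTI data, and reading 60%:10%:30% as training, validation, and test shares is an assumption. Rounding the two smaller parts down and giving the remainder to the largest part is also an assumption, chosen because it reproduces the 189:31:94 counts for 314 samples.

```python
import numpy as np

def make_lagged(series, delay):
    """Build one-step-ahead pairs: each row of X holds the `delay`
    values that immediately precede the matching target in y."""
    series = np.asarray(series, dtype=float)
    n_pairs = len(series) - delay
    X = np.stack([series[i:i + delay] for i in range(n_pairs)])
    y = series[delay:]
    return X, y

def split_counts(n_samples, val_pct=10, test_pct=30):
    """Assumed rounding rule: round the validation and test parts down
    and give the remainder to the training part."""
    n_val = n_samples * val_pct // 100
    n_test = n_samples * test_pct // 100
    n_train = n_samples - n_val - n_test
    return n_train, n_val, n_test

# Synthetic placeholder series with the same length as the dataset (314).
traffic = np.sin(np.linspace(0.0, 20.0, 314))

X, y = make_lagged(traffic, delay=5)
print(X.shape, y.shape)   # (309, 5) (309,)
print(split_counts(314))  # (189, 31, 94)
```

With a delay of 5, the first five samples have no complete history, so 314 samples yield 309 input/target pairs. Whether the partition is applied before or after this lagging step is not stated in the text.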
978-1-4799-8386-5/15/$31.00 ©2015 IEEE
2015 International Conference on Science in Information Technology (ICSITech)