CiiT International Journal of Artificial Intelligent Systems and Machine Learning, Vol 4, No 4, April 2012 223
0974-9667/CIIT–IJ-3101/07/$20/$100 © 2012 CiiT Published by the Coimbatore Institute of Information Technology
Abstract---For the planning, land use, design of civil projects and
water resources management, the accurate prediction of hydrological
behaviour in the watershed can provide valuable information.
Hydrologic systems include, to a large extent, stochastic components
and are often non-linear and non-stationary. In spite of the high
adaptability of Artificial Neural Networks (ANNs) in modelling
hydrologic time series, signals are often highly non-stationary and
exhibit seasonal irregularity. In such cases, the prediction accuracy of
an ANN suffers if the data are not pre-processed. In this study, different
data pre-processing techniques are presented to deal with the irregularity
components that exist in daily hydrologic time series data from the
Pancharatna gauging station in the Brahmaputra basin within India,
and their properties are evaluated by performing
one-step-ahead flow forecasting using an ANN. The model results are
evaluated using root mean square error (RMSE) and mean absolute
percentage error (MAPE), and it was found that the logarithm-based
pre-processing technique provides the best forecasting performance
among the pre-processing techniques considered. The results indicate that
detecting non-stationarity and selecting an appropriate
pre-processing technique is highly beneficial in improving the
prediction performance of an ANN model.
Keywords---ANN, Non-Stationary, Data Pre-Processing,
Activation Function, Time Series.
I. INTRODUCTION
In time series analysis it is common to assume that time series
data have constant mean and variance, i.e. that they are stationary.
This is generally true except when abrupt data changes occur,
resulting in non-stationary variance, or when there is a trend
in the series, resulting in a non-stationary mean. Pre-processing
techniques facilitate stabilization of the mean and variance
and remove seasonality in the data used to build soft computing
models.
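As a rough illustration of the constant mean and variance assumption, the sketch below splits a series into equal-length segments and compares their means and variances. The function name `segment_stats`, the choice of two segments and both synthetic series are assumptions made for illustration only; they are not data or methods from this study.

```python
from statistics import mean, pvariance


def segment_stats(series, n_segments=2):
    """Return a (mean, variance) pair for each equal-length segment."""
    size = len(series) // n_segments
    if size == 0:
        raise ValueError("series is too short for the requested segments")
    chunks = [series[i * size:(i + 1) * size] for i in range(n_segments)]
    return [(mean(c), pvariance(c)) for c in chunks]


# Hypothetical series with an upward trend: the mean drifts between segments.
trending = [float(t) for t in range(20)]

# Hypothetical series without a trend: values alternate around a level of 5.
level = [5.0 + (1.0 if t % 2 == 0 else -1.0) for t in range(20)]

trend_stats = segment_stats(trending)
level_stats = segment_stats(level)

print("trending:", trend_stats)
print("level:   ", level_stats)
```

Segment statistics that differ markedly, as the means do for the trending series, hint at a non-stationary mean or variance. This is only an informal check, not a formal stationarity test.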
Recently, ANNs have shown great ability in modeling and
forecasting nonlinear hydrologic time series. Although classic
time series models like autoregressive moving average
(ARMA) are widely used for hydrological time series
forecasting, they are linear models that assume the data
are stationary, and they have limited ability to capture
non-stationarities and non-linearities in hydrologic data. The authors of [1]
have demonstrated the effects of non-stationarity on ANN
prediction of economic time series. ANNs are found suitable for
handling huge amounts of dynamic, nonlinear and noisy data
when the underlying physical relationships are not fully understood
[2]. In spite of the high flexibility of ANNs in modeling hydrologic
time series, signals are sometimes highly non-stationary and
exhibit seasonal irregularity. In such cases, an ANN may not be
able to cope with non-stationary data if pre-processing of the input
and/or output data is not performed [3]. According to the study in
[4], data pre-processing is one of the most important steps in
developing an ANN model for prediction. The authors presented
three data pre-processing strategies, described the advantages and
disadvantages of each, and compared the results of each approach. The authors of [5]
studied the effect of different data sample sizes on the
learning and generalization ability of ANNs.
Manuscript received on March 19, 2012, review completed on April 04,
2012 and revised on April 07, 2012.
Aniruddha Gopal Banhatti, Research Scholar, Department of Applied
Mechanics, National Institute of Technology, Karnataka, Surathkal, India.
E-Mail: anibanister@gmail.com
Paresh Chandra Deka, Associate Professor, Department of Applied
Mechanics, National Institute of Technology, Karnataka, Surathkal, India.
E-Mail: pareshdeka@yahoo.com
Digital Object Identifier No: AIML042012010.
Data pre-processing is an important step in developing an ANN
application, and it can affect model accuracy and results.
Pre-processing refers to analyzing and transforming input and
output variables in order to detect trends, minimize noise,
highlight important relationships and flatten the variables’
distribution. These analyses and transformations help the
model learn relevant patterns.
Also, due to the chaotic nature of the data, values of a time series
can vary over wide ranges within a very short period of time.
This can cause great difficulty for an ANN, which can be
disturbed by large fluctuations in the values. Furthermore, the
activation functions used by ANNs are bounded, which can cause
inconsistencies in both the training and prediction phases. To
avoid this pitfall, the data are usually scaled to the range 0 to 1 or
-1 to +1, so that they are consistent with the type of transfer function
being used.
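The scaling described above can be sketched as a linear min-max mapping. This is a minimal illustration, not the study's implementation: the function names `minmax_scale` and `minmax_inverse` and the flow values are hypothetical.

```python
def minmax_scale(values, low=0.0, high=1.0):
    """Linearly map values so that the minimum goes to low and the maximum to high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("cannot scale a constant series")
    span = v_max - v_min
    return [low + (high - low) * (v - v_min) / span for v in values]


def minmax_inverse(scaled, v_min, v_max, low=0.0, high=1.0):
    """Map scaled values back to the original units."""
    return [v_min + (s - low) * (v_max - v_min) / (high - low) for s in scaled]


# Hypothetical daily flows (illustrative numbers, not data from the study).
flows = [1200.0, 3500.0, 800.0, 15000.0, 4200.0]

unit = minmax_scale(flows)                  # scaled to [0, 1]
symmetric = minmax_scale(flows, -1.0, 1.0)  # scaled to [-1, +1]
restored = minmax_inverse(unit, min(flows), max(flows))

print(unit)
print(symmetric)
print(restored)
```

The inverse mapping is included because forecasts produced in the scaled range have to be converted back to flow units before error measures are computed on them. In practice the minimum and maximum are usually taken from the training data only, which is an assumption of common practice rather than something stated here.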
Noise reduction in time series data is also very
important, as outliers can degrade the quality
of the results. For this purpose, it is possible to use the
logarithmic transformation (natural log). The logarithmic
transformation is useful in correcting right skewness (asymmetry to the right)
in the data distribution. This transformation also allows
multiplicative relations to be converted into additive relations,
which simplifies and improves data modeling.
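The two properties of the natural-log transformation mentioned above, reduced right skewness and the conversion of products into sums, can be checked with a short sketch. The helper names, the moment-based skewness measure and the flow values are assumptions chosen for illustration; the study does not specify them.

```python
import math


def log_transform(values):
    """Natural-log transform; only defined for strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform needs strictly positive values")
    return [math.log(v) for v in values]


def inverse_log_transform(values):
    """Undo the natural-log transform."""
    return [math.exp(v) for v in values]


def skewness(values):
    """Moment-based skewness; positive values indicate a longer right tail."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical flows with one large peak (illustrative, not study data).
flows = [800.0, 1200.0, 1500.0, 2100.0, 3500.0, 4200.0, 15000.0]
logged = log_transform(flows)
restored = inverse_log_transform(logged)

print("skewness before:", skewness(flows))
print("skewness after: ", skewness(logged))

# A multiplicative relation becomes additive: log(a * b) = log(a) + log(b).
a, b = 3.0, 7.0
print(math.log(a * b), math.log(a) + math.log(b))
```

For this sample the skewness of the logged values is smaller than that of the raw values, because the logarithm compresses the large peak more strongly than the low flows.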
Before data are used by an algorithm, they must go through
several transformations in order to prepare the input. The
success of an algorithm greatly depends on the quality of the input
data. As different methods can handle only certain kinds of samples, it
is proposed to exploit certain data features with the purpose of
finding out which pre-processing transformation works best.
Performance Evaluation of Artificial Neural Network
Model using Data Preprocessing in Non-Stationary
Hydrologic Time Series
Aniruddha Gopal Banhatti and Paresh Chandra Deka