CiiT International Journal of Artificial Intelligent Systems and Machine Learning, Vol 4, No 4, April 2012, p. 223
0974-9667/CIIT–IJ-3101/07/$20/$100 © 2012 CiiT. Published by the Coimbatore Institute of Information Technology

Abstract---Accurate prediction of hydrological behaviour in a watershed provides valuable information for planning, land use, the design of civil projects and water resources management. Hydrologic systems include, to a large extent, stochastic components and are often non-linear and non-stationary. Despite the high adaptability of the Artificial Neural Network (ANN) in modelling hydrologic time series, signals are often highly non-stationary and exhibit seasonal irregularity. In such cases, the prediction accuracy of an ANN suffers if the data are not pre-processed. In this study, different data pre-processing techniques are presented to deal with the irregular components in hydrologic time series data, at a daily time step, from the Pancharatna gauging station in the Brahmaputra basin within India, and their properties are evaluated by performing one-step-ahead flow forecasting using an ANN. The model results are evaluated using root mean square error (RMSE) and mean absolute percentage error (MAPE). The logarithm-based pre-processing technique was found to provide better forecasting performance than the other pre-processing techniques considered. The results indicate that detecting non-stationarity and selecting an appropriate pre-processing technique are highly beneficial in improving the prediction performance of the ANN model.

Keywords---ANN, Non-Stationary, Data Pre-Processing, Activation Function, Time Series.

I. INTRODUCTION

In time series analysis it is common to assume that time series data have constant mean and variance, i.e. that they are stationary. This generally holds except when abrupt changes in the data occur, resulting in a non-stationary variance, or when there is a trend in the series, resulting in a non-stationary mean.
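As a rough illustration of the constant-mean, constant-variance assumption described above, the sketch below compares the mean and variance of consecutive blocks of a series. It is not a method taken from this study: the function name `block_stats`, the block count and the synthetic series are all assumptions chosen for illustration, and a formal stationarity test would normally be preferred.

```python
import numpy as np

def block_stats(series, n_blocks=4):
    """Return the mean and variance of consecutive, equal-length blocks.

    Roughly constant block means and variances are consistent with a
    stationary series; a drift in either suggests non-stationarity.
    """
    series = np.asarray(series, dtype=float)
    usable = len(series) - len(series) % n_blocks  # drop the remainder
    blocks = series[:usable].reshape(n_blocks, -1)
    return blocks.mean(axis=1), blocks.var(axis=1)

# Synthetic data only (not from the study): noise around a fixed level,
# and the same noise with a linear trend added.
rng = np.random.default_rng(0)
t = np.arange(400)
stationary = rng.normal(loc=10.0, scale=1.0, size=t.size)
trending = stationary + 0.05 * t  # trend gives a non-stationary mean

means_s, vars_s = block_stats(stationary)
means_t, vars_t = block_stats(trending)

print("stationary block means:", np.round(means_s, 2))
print("trending block means:  ", np.round(means_t, 2))
```

For the trending series the block means rise steadily from one block to the next, while for the stationary series they stay close to the same level.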
Pre-processing techniques facilitate stabilization of the mean and variance and remove seasonality from the data used to build soft computing models. Recently, ANNs have shown great ability in modeling and forecasting nonlinear hydrologic time series. Although classic time series models such as autoregressive moving average (ARMA) are widely used for hydrological time series forecasting, they are linear models that assume the data are stationary, and they have limited ability to capture non-stationarities and non-linearities in hydrologic data. The authors of [1] have demonstrated the effects of non-stationarity on ANN prediction of economic time series. ANNs are found suitable for handling huge amounts of dynamic, nonlinear and noisy data when the underlying physical relationships are not fully understood [2]. In spite of the high flexibility of ANNs in modeling hydrologic time series, signals are sometimes highly non-stationary and exhibit seasonal irregularity. In such cases, an ANN may not be able to cope with non-stationary data if pre-processing of the input and/or output data is not performed [3]. According to [4], data pre-processing is one of the most important steps in developing an ANN model for prediction; the authors presented three data pre-processing strategies, gave the advantages and disadvantages of each approach, and compared their results. The authors of [5] studied the effect of different data sample sizes on the learning and generalization ability of ANNs.

Manuscript received on March 19, 2012, review completed on April 04, 2012 and revised on April 07, 2012.
Aniruddha Gopal Banhatti, Research Scholar, Department of Applied Mechanics, National Institute of Technology, Karnataka, Surathkal, India. E-Mail: anibanister@gmail.com
Paresh Chandra Deka, Associate Professor, Department of Applied Mechanics, National Institute of Technology, Karnataka, Surathkal, India. E-Mail: pareshdeka@yahoo.com
Digital Object Identifier No: AIML042012010.
Performance Evaluation of Artificial Neural Network Model using Data Preprocessing in Non-Stationary Hydrologic Time Series
Aniruddha Gopal Banhatti and Paresh Chandra Deka

Data pre-processing is an important step in developing an ANN application and can affect model accuracy and results. Pre-processing refers to analyzing and transforming input and output variables in order to detect trends, minimize noise, underline important relationships and flatten the variables' distributions. These analyses and transformations help the model learn relevant patterns. Also, due to the chaotic nature of the data, the values of a time series can vary over wide ranges within a very short period of time. This can cause great difficulty for an ANN, which can be disturbed by large fluctuations in the values. Furthermore, the activation functions used by an ANN are bounded, so unscaled values can cause inconsistencies in both the training and prediction phases. To avoid this pitfall, the data are usually scaled to the range 0 to 1 or -1 to +1, so that they are consistent with the type of transfer function being used.

Noise reduction in time series data is also very important, as outliers can degrade the quality of the results. For this purpose, it is possible to use the logarithmic transformation (natural log). The logarithmic transformation is useful in correcting right skew in the data distribution. It also allows multiplicative relations to be converted into additive relations, which simplifies and improves data modeling.

Before data are used by an algorithm, they must go through several transformations in order to prepare the input. The success of an algorithm greatly depends on the quality of the input data. Because different methods are suited to different kinds of data samples, it is proposed to exploit certain data features with the purpose of finding out which pre-processing transformation works best.
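The scaling and logarithmic transformation described in the pre-processing discussion can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study: the function names (`log_transform`, `minmax_scale`, `minmax_inverse`), the choice to apply the log before scaling, and the flow values are all hypothetical.

```python
import numpy as np

def log_transform(flow):
    """Natural-log transform; assumes strictly positive flow values."""
    flow = np.asarray(flow, dtype=float)
    if np.any(flow <= 0):
        raise ValueError("log transform requires strictly positive values")
    return np.log(flow)

def minmax_scale(x, lo=0.0, hi=1.0):
    """Linearly rescale x to [lo, hi].

    Returns the scaled data and the (min, max) pair needed to invert
    the scaling on model outputs.
    """
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        raise ValueError("cannot scale a constant series")
    scaled = lo + (x - x_min) * (hi - lo) / (x_max - x_min)
    return scaled, (x_min, x_max)

def minmax_inverse(scaled, bounds, lo=0.0, hi=1.0):
    """Undo minmax_scale using the stored (min, max) pair."""
    x_min, x_max = bounds
    scaled = np.asarray(scaled, dtype=float)
    return x_min + (scaled - lo) * (x_max - x_min) / (hi - lo)

# Hypothetical daily flows (illustrative values, not data from the study).
flow = np.array([4200.0, 5100.0, 8800.0, 35000.0, 61000.0, 27000.0, 9500.0])

log_flow = log_transform(flow)                             # reduce right skew
scaled, bounds = minmax_scale(log_flow, lo=-1.0, hi=1.0)   # match a tanh-like range

# Map scaled values back to the original units.
recovered = np.exp(minmax_inverse(scaled, bounds, lo=-1.0, hi=1.0))
```

In practice the scaling bounds would be computed from the training portion of the series only and then reused for validation and test data, so that no information from the forecast period leaks into the inputs.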