Research papers

A stepwise model to predict monthly streamflow

Anas Mahmood Al-Juboori a,b, Aytac Guven a,*

a Civil Engineering Department, Gaziantep University, 27310 Gaziantep, Turkey
b Dams and Water Resources Research Center, Mosul University, Iraq

Article info

Article history:
Received 13 April 2016
Received in revised form 28 September 2016
Accepted 2 October 2016
Available online xxxx

This manuscript was handled by G. Syme, Editor-in-Chief, with the assistance of Hazi Azamathulla, Associate Editor

Keywords: Monthly streamflow; Gene Expression Programming; Generalized Reduced Gradient Optimization; Markovian model; ARIMA

Abstract

In this study, a stepwise model empowered with genetic programming is developed to predict the monthly flows of the Hurman River in Turkey and the Diyalah and Lesser Zab Rivers in Iraq. The model divides the monthly flow data into twelve intervals, one for each month of the year. The flow of month t is considered a function of the antecedent month's flow (t − 1) and is predicted by multiplying the antecedent monthly flow by a constant value called K. The optimum value of K is obtained by a stepwise procedure that employs Gene Expression Programming (GEP) and Nonlinear Generalized Reduced Gradient Optimization (NGRGO) as an alternative to traditional nonlinear regression techniques. The coefficient of determination (R²) and the root mean squared error are used to evaluate the performance of the proposed models. The results of the proposed model are compared with those of the conventional Markovian and Auto Regressive Integrated Moving Average (ARIMA) models, based on observed monthly flow data. The comparison, based on five different statistical measures, shows that the proposed stepwise model performed better than both the Markovian model and the ARIMA model. The R² values of the proposed model range between 0.81 and 0.92 for the three rivers in this study.

© 2016 Published by Elsevier B.V.

1. 
Introduction

Monthly streamflow prediction is an important issue in water resources management, reservoir operation, hydropower projects, water supply, etc. Many methodologies have been developed to improve monthly flow forecasting from past measurements. No single method performs well for all basins; for a given watershed, therefore, different techniques model different aspects of its physical behavior. In recent decades, artificial intelligence (AI) techniques have been widely used in modeling hydrological phenomena. A number of studies have been conducted to find accurate and applicable models (Yilmaz et al., 2011; Huo et al., 2012; Meshgi et al., 2015; Kisi and Parmar, 2016).

Gene Expression Programming (GEP) has become popular among the AI techniques in various fields of water resources and geoscience. GEP is a symbolic regression algorithm that builds mathematical functions as an alternative to traditional nonlinear regression techniques and autoregressive models (Guven, 2009; Guven and Talu, 2010; Traore and Guven, 2013; Karimi et al., 2015). The GEP algorithm, invented by Ferreira (2001), is an extension of genetic programming (GP). The basic difference between GEP and GP lies in how the computer programs are represented. GP programs (individuals) are non-linear entities of different sizes and shapes (parse trees). In GEP the programs are also non-linear entities of different sizes and shapes (expression trees), but these complex entities are encoded as simple strings of fixed length (chromosomes) (Ferreira, 2001, 2006). Unlike traditional linear and non-linear regression, the form of a GEP function is not fixed in advance. GEP uses a genetic evolution algorithm to fit the data and obtain an optimum form of a mathematical function (Fernando et al., 2012).
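To make the encoding concrete, the following minimal Python sketch decodes a fixed-length string into an expression tree by reading it left to right and filling the tree level by level, in the manner of the Karva notation described by Ferreira. The function set, the sample gene string, the input values and the names `decode` and `evaluate` are illustrative assumptions and are not taken from this study.

```python
import math
from collections import deque

# Illustrative function set (an assumption): symbol -> (arity, implementation).
# "Q" stands for the square root.
FUNCTIONS = {
    "+": (2, lambda a, b: a + b),
    "-": (2, lambda a, b: a - b),
    "*": (2, lambda a, b: a * b),
    "/": (2, lambda a, b: a / b),
    "Q": (1, math.sqrt),
}


def decode(gene):
    """Build an expression tree from a gene string, filling it level by level."""
    symbols = iter(gene)
    root = [next(symbols), []]   # a node is [symbol, children]
    pending = deque([root])      # nodes whose children are not attached yet
    while pending:
        node = pending.popleft()
        arity = FUNCTIONS[node[0]][0] if node[0] in FUNCTIONS else 0
        for _ in range(arity):
            child = [next(symbols), []]
            node[1].append(child)
            pending.append(child)
    return root                  # symbols left over in the string are not expressed


def evaluate(node, inputs):
    """Evaluate a decoded tree for a dict mapping terminal symbols to values."""
    symbol, children = node
    if symbol in FUNCTIONS:
        func = FUNCTIONS[symbol][1]
        return func(*(evaluate(child, inputs) for child in children))
    return inputs[symbol]


# Hypothetical fixed-length string; only its first 8 symbols end up in the tree.
gene = "Q*+-abcdaab"
tree = decode(gene)
value = evaluate(tree, {"a": 5.0, "b": 4.0, "c": 3.0, "d": 2.0})
print(value)  # 3.0
```

Here the string decodes to sqrt((a + b) * (c - d)), which evaluates to 3.0 for the inputs shown. The trailing symbols are not part of the tree, which is how strings of a single fixed length can encode trees of different sizes and shapes.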
The resulting GEP program (solution) for a given problem is generated automatically by coding the expression as a tree structure with nodes (functions) and leaves (terminals). A fitness function is used to evaluate the generated candidates, which then reproduce with modification, leaving progeny with new traits. The candidates of this new generation are, in their turn, subjected to the same developmental process: expression of the genomes, confrontation of the selection environment, and reproduction with modification. The process is repeated for a certain number of generations or until a solution has been found (Ferreira, 2001).

* Corresponding author. E-mail address: aguven@gantep.edu.tr (A. Guven).
Journal of Hydrology xxx (2016) xxx–xxx. Journal homepage: www.elsevier.com/locate/jhydrol
0022-1694/© 2016 Published by Elsevier B.V.
Please cite this article in press as: Mahmood Al-Juboori, A., Guven, A. A stepwise model to predict monthly streamflow. J. Hydrol. (2016), http://dx.doi.org/10.1016/j.jhydrol.2016.10.006

The GEP code is very simple. The symbols of the chromosome and the nodes of the tree are in one-to-one correspondence. GEP genes are composed of a head and a tail. The head contains symbols that represent both functions (+, −, ×, /, power, x², etc.) and terminals (inputs or constants), whereas the tail contains