Feature extraction for time-series data: An artificial neural network evolutionary training model for the management of mountainous watersheds

Thomas J. Glezakos *, Theodore A. Tsiligiridis, Lazaros S. Iliadis, Constantine P. Yialouris, Fotis P. Maris, Konstantinos P. Ferentinos

Agricultural University of Athens, Department of Science, Laboratory of Informatics, 75 Iera Odos Street, 11855 Athens, Hellas, Greece

Article info
Available online 6 August 2009

Keywords: Genetic algorithms; Artificial neural networks; Maximum volume of water flow; Average annual water supply; Evolutionary time-series processing; Genetic ANN training

Abstract
The present manuscript is the result of research conducted towards a wider use of artificial neural networks in the management of mountainous water supplies. The novelty lies in the evolutionary clustering of time-series data, which are then used for the training and testing of a neural object, with meta-heuristics applied in the neural training phase, for the management of water resources and for torrential risk estimation and modelling. It is essentially an attempt to develop a more credible forecasting system, exploiting an evolutionary approach to interpret and model the significance of time-series data for the behavior of the aforementioned environmental reserves. The proposed model, designed to estimate effectively the average annual water supply of the various mountainous watersheds, accepts as inputs a wide range of meta-data produced via an evolutionary genetic process. The data used for the training and testing of the system refer to certain watersheds spread over the island of Cyprus and span a wide temporal period.
The proposed method incorporates an evolutionary process to manipulate the time-series data of the average monthly rainfall recorded by the measuring stations. The algorithm includes special encoding, initialization, performance evaluation, genetic operations and pattern matching tools for the evolution of the time-series into significantly sampled data.

© 2009 Published by Elsevier B.V.

1. Introduction

The most common cause of a flood surge at a given time and place is the overflow of bodies of water, or an inexorable rise of tides, due to a significant amount of rainfall or excessive snowmelt that overloads the capacity of nearby natural or artificial reservoirs. The National Flood Insurance Program defines a flood as an excess of water on land that is normally dry, or as a general and temporary condition of partial or complete inundation of two or more acres of normally dry land area from overflow of inland or tidal water, from unusual and rapid accumulation or runoff of surface waters from any source, from mudflow, or from collapse or subsidence of land along the shore of a lake or similar body of water as a result of erosion or undermining caused by waves or currents of water exceeding anticipated cyclical levels.

A flood need not occur near a large body of water. Flash floods can occur anywhere, regardless of altitude, longitude or latitude, when a large volume of rain falls within a short period of time over the same area. It is also common knowledge that torrential streams which overflow and run wild can cause heavy floods, which become more dangerous as the ability of the soil to absorb water diminishes and as the average annual rain height increases. The flow and power of a torrential stream depend less on the total amount of precipitation than on its peaks at certain periods of the year.
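The distinction between total precipitation and its seasonal peaks can be illustrated with a minimal sketch. The monthly values below are hypothetical and are not taken from the Cyprus data set, and the helper names (`annual_total`, `peak_month`, `peak_share`) are introduced here for illustration only. The sketch compares two years with the same annual rainfall but very different monthly peaks.

```python
from typing import Sequence, Tuple

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def annual_total(monthly_mm: Sequence[float]) -> float:
    """Total rainfall over the year, in mm."""
    return float(sum(monthly_mm))


def peak_month(monthly_mm: Sequence[float]) -> Tuple[str, float]:
    """Name and rainfall (mm) of the wettest month (first one on ties)."""
    idx = max(range(len(monthly_mm)), key=lambda i: monthly_mm[i])
    return MONTHS[idx], float(monthly_mm[idx])


def peak_share(monthly_mm: Sequence[float]) -> float:
    """Fraction of the annual total that falls in the wettest month."""
    total = annual_total(monthly_mm)
    return max(monthly_mm) / total if total > 0 else 0.0


# Hypothetical monthly rainfall (mm); both years sum to the same total.
even_year = [50.0] * 12
peaky_year = [150.0, 120.0, 60.0, 30.0, 10.0, 0.0,
              0.0, 0.0, 10.0, 40.0, 80.0, 100.0]

for name, year in (("even", even_year), ("peaky", peaky_year)):
    month, mm = peak_month(year)
    print(f"{name}: total={annual_total(year):.0f} mm, "
          f"peak={mm:.0f} mm in {month}, share={peak_share(year):.2f}")
```

A summary based only on the annual total cannot tell these two years apart, whereas the monthly series can. This is one reason to keep the time-series at monthly resolution when modelling torrential risk.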
Because torrential surges occur at certain seasonal intervals, researchers have analysed them using various data sets accumulated in various ways. It is now clear that a country's water resources play a leading role in the well-being of its citizens and are very important for its sustainable development, so their management is considered a crucial issue. The most thorough way to manage the flow of torrential streams and the flood risk they pose to the environment is time-series analysis, in which monthly rainfall is considered the most basic element. Other factors are considered as well, such as the landscape type and structure, the altitude of the stream, the surface of the watershed, the land use and land cover and so on, but they are not as crucial as the time-series.

* Corresponding author. Tel.: +302105294181. E-mail address: t_glezakos@yahoo.com (T.J. Glezakos).

Neurocomputing 73 (2009) 49–59. ISSN 0925-2312. doi:10.1016/j.neucom.2008.08.024