Global NEST Journal, Vol 14, No 3, pp 354-361, 2012
Copyright © 2012 Global NEST
Printed in Greece. All rights reserved

ASSOCIATIONS BETWEEN STREAM FLOW AND CLIMATIC VARIABLES AT KIZILIRMAK RIVER BASIN IN TURKEY

F. DADASER-CELIK 1*, M. CELIK 2, A.S. DOKUZ 2
1 Department of Environmental Engineering, Erciyes University, Kayseri, Turkey
2 Department of Computer Engineering, Erciyes University, Kayseri, Turkey

Received: 05/12/11
Accepted: 27/07/2012
*to whom all correspondence should be addressed: e-mail: fdadaser@erciyes.edu.tr

ABSTRACT
This study aims to demonstrate the use of association analysis for discovering the relationships between stream flow and climatic variables in the Kızılırmak River Basin in Turkey. Association analysis is a data mining technique that aims to discover rules of the form A → B that occur in large datasets with a frequency above a given threshold. A and B can be defined as events of a certain type, and the rule is read as "if A occurs, then B occurs". In this study, A refers to climatic variable(s) (i.e., precipitation, temperature, wind speed, relative humidity) of a certain magnitude, and B refers to the magnitude of stream flow. The interestingness of the rules was quantified using support and confidence measures. Stream-flow data from three gauging stations in the Kızılırmak River Basin and climate data from three weather stations in the same basin were included in the analyses. All data were first segregated into three groups, named low, medium, and high. The low and high ranges of the stream-flow data were each further divided into three subgroups to sharpen the focus on extreme events. The analyses were conducted at the annual and seasonal timescales. The analyses indicated that the relationships of precipitation and temperature with stream flow are the most prevalent, but relative humidity and wind speed are also important determinants of stream flow in the Kızılırmak River Basin.
KEYWORDS: stream flow, climate, data mining, association analysis, Kızılırmak River Basin.

1. INTRODUCTION
Identification of the relationships between hydrologic and climatic variables is very important for many hydrologic applications, such as prediction of missing records, analysis of climate change impacts, and estimation of hydrologic responses in ungauged basins. Unfortunately, identifying these relationships is not straightforward, owing to the characteristics of the data (Shekhar et al., 2009; Ganguly and Steinhaeuser, 2008) and the complexity of hydroclimatic relationships: (1) hydrologic and climatic data are geographical data and have spatial and temporal correlations; (2) hydrologic and climatic data have nonlinear dependences, long memory in time, and long-range connections (teleconnections) in space; (3) the linkages between hydrologic and climatic data are based on complex physical processes that are difficult to conceptualize. Hydrological models have been developed to improve our understanding of hydrologic and climatic linkages, but they need local-level information on hydrogeology, soils, topography, land use, etc. This information is often hard to obtain, and even harder when multiple locations are of interest, e.g., when a regional study is to be conducted; (4) hydrologic and climatic datasets in many areas include gaps and missing records, which pose a major problem in statistical analysis.

In this study, the goal is to develop and apply a data mining technique, called association analysis, for discovering the relationships between hydrologic (i.e., stream flow) and climatic variables. Data mining aims to develop automatic or semi-automatic methods for discovering unforeseen, interesting, and meaningful relationships in large, heterogeneous datasets that cannot be analyzed manually.
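To make the idea concrete, the sketch below shows how a single rule of the kind described in the abstract ("if A occurs, then B occurs") could be scored with support and confidence after the data are grouped into low, medium, and high. It is only an illustration under stated assumptions: the grouping uses equal-count thirds, which this text does not specify; the function names `discretize` and `rule_measures` are hypothetical; and the precipitation and stream-flow values are made up, not taken from the Kızılırmak stations. Support is computed as the fraction of all records in which A and B occur together, and confidence as the fraction of records with A in which B also occurs, which are the standard definitions.

```python
from typing import List, Sequence, Tuple


def discretize(values: Sequence[float]) -> List[str]:
    """Label each value low/medium/high using equal-count thirds.

    Assumption: the grouping rule is not given in the text; equal-count
    thirds are used here only for illustration.
    """
    ranked = sorted(values)
    n = len(ranked)
    lower_cut = ranked[n // 3]        # smallest value of the middle third
    upper_cut = ranked[(2 * n) // 3]  # smallest value of the upper third
    labels = []
    for v in values:
        if v < lower_cut:
            labels.append("low")
        elif v < upper_cut:
            labels.append("medium")
        else:
            labels.append("high")
    return labels


def rule_measures(
    antecedent: Sequence[str],
    consequent: Sequence[str],
    a_label: str,
    b_label: str,
) -> Tuple[float, float]:
    """Return (support, confidence) for the rule A=a_label -> B=b_label."""
    n = len(antecedent)
    if n == 0 or n != len(consequent):
        raise ValueError("series must be non-empty and of equal length")
    count_a = sum(1 for a in antecedent if a == a_label)
    count_ab = sum(
        1 for a, b in zip(antecedent, consequent)
        if a == a_label and b == b_label
    )
    support = count_ab / n
    # Confidence is undefined when A never occurs; report 0.0 in that case.
    confidence = count_ab / count_a if count_a else 0.0
    return support, confidence


# Made-up annual values (hypothetical, for illustration only).
precipitation = [300, 320, 350, 400, 420, 450, 500, 520, 560]
stream_flow = [10, 12, 11, 20, 22, 19, 35, 30, 21]

precip_labels = discretize(precipitation)
flow_labels = discretize(stream_flow)

# Score the rule "high precipitation -> high stream flow".
support, confidence = rule_measures(precip_labels, flow_labels, "high", "high")
print(f"support={support:.3f}, confidence={confidence:.3f}")
```

In a full analysis, every combination of antecedent and consequent labels would be scored this way, and only rules whose support and confidence exceed chosen thresholds would be retained.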
With this approach, it is possible to extract cause-effect relationships, determine which variables have the strongest relationships to the problems of interest, and develop models that