Use of different approaches to model presence/absence of Salmo marmoratus in Piedmont (Northwestern Italy)

Tina Tirelli a,*, Luca Pozzi b, Daniela Pessani a

a Dipartimento di Biologia Animale e dell'Uomo, Via Accademia Albertina 13-10123 Torino, Italy
b New York University, Department of Anthropology, 25 Waverly Place, New York, NY 10003, USA

Article info

Article history:
Received 16 March 2009
Received in revised form 10 July 2009
Accepted 13 July 2009

Keywords:
Discriminant function analysis
Logistic regression
Decision tree
Artificial neural network
Sensitivity analysis
Species prediction

Abstract

In Piedmont (Italy) the environmental changes due to human impact have had profound effects on rivers and their inhabitants. Thus, it is necessary to develop practical tools providing accurate ecological assessments of river and species conditions. We focus our attention on Salmo marmoratus, an endangered salmonid which is characteristic of the Po river system in Italy. In order to contribute to the management of the species, four different approaches were used to assess its presence: discriminant function analysis, logistic regression, decision tree models and artificial neural networks. Either all the 20 environmental variables measured in the field or the 7 coming from feature selection were used to classify sites as positive or negative for S. marmoratus. The performances of the different models were compared. Discriminant function analysis, logistic regression, and decision tree models (unpruned and pruned) had relatively high percentages of correctly classified instances. Although neither tree-pruning technique improved the reliability of the models significantly, they did reduce the tree complexity and hence increased the clarity of the models. The artificial neural network (ANN) approach, especially the model built with the 7 inputs coming from feature selection, showed better performance than all the others.
The relative contribution of each independent variable to this model was determined using sensitivity analysis. Our findings showed that the ANNs were more effective than the other classification techniques. Moreover, ANNs showed high potential when applied in models used to support decisions regarding river and conservation management.

© 2009 Elsevier B.V. All rights reserved.

1. Introduction

Globally, freshwaters are rapidly deteriorating and hence these systems are receiving increasing attention (Allan and Flecker, 1993; Matson et al., 1997; Postel, 2000). In Italy, and especially in Piedmont, human activities have had a considerable impact on rivers. Nutrient balances have been altered by both agricultural run-off and urban sewage discharges. Sediment inputs have increased through a combination of deforestation, floods, and road building. These changes have had profound effects on rivers and their inhabitants. Thus, there is a need for practical tools providing accurate ecological assessments of river and species conditions, ultimately in order to develop measures for habitat and species preservation. Moreover, we need to understand in depth the relationship between the environment and the occurrence of the organisms inhabiting rivers and streams. This is fundamental for conservation management and river restoration.

To reach these goals, modeling is becoming an increasingly important tool for improving decision-making and management policies. Freshwater modeling has made substantial progress over the last decade. Still, these ecosystems are very complex and hence hard to understand, despite the substantial improvements made in ecosystem modeling and computation (Recknagel, 2002) and despite the development of highly reliable models.
Over the last several years, researchers have increasingly applied machine learning methods to ecology (Lek and Guégan, 1999; Debeljak et al., 2001; Recknagel, 2001; Dzeroski and Todorovski, 2003; Dakou et al., 2007; Goethals et al., 2007; Lencioni et al., 2007; Pivard et al., 2008). In fact, ecosystems characteristically show highly complex nonlinear relationships among their input variables. Thus machine learning techniques offer several advantages over traditional statistical analysis. Principally, they introduce fewer prior assumptions about the relationships among the variables. Many machine learning techniques could be applied, but decision trees (Quinlan, 1986), artificial neural networks (Lek and Guégan, 1999), fuzzy logic (Barros et al., 2000), and Bayesian belief networks (Adriaenssens et al., 2004) are reported to be the most effective for habitat suitability modeling (Goethals and De Pauw, 2001; Dakou et al., 2007).

Ecological Informatics 4 (2009) 234–242
doi:10.1016/j.ecoinf.2009.07.003
* Corresponding author. Tel.: +39 011 6704538; fax: +39 011 6704508. E-mail address: santina.tirelli@unito.it (T. Tirelli).

In the present study we focus our attention on the marble trout Salmo marmoratus (Cuvier, 1817), an endangered salmonid that can be distinguished from other Salmo species on the basis of its color pattern