Finer grain size increases effects of error and changes influence of environmental predictors on species distribution models

Brice B. Hanberry
University of Missouri, 203 Natural Resources Building, Columbia, MO 65211, USA
Corresponding author. Tel.: +1 573 875 5341x230; fax: +1 573 882 1977. E-mail address: hanberryb@missouri.edu.

Ecological Informatics 15 (2013) 8–13
journal homepage: www.elsevier.com/locate/ecolinf
http://dx.doi.org/10.1016/j.ecoinf.2013.02.003
1574-9541/$ – see front matter © 2013 Elsevier B.V. All rights reserved.

Article info

Article history: Received 14 May 2012; Received in revised form 21 February 2013; Accepted 22 February 2013; Available online 5 March 2013

Keywords: Minimum mapping unit; Pseudoabsences; Resolution; Soil; Topographic; Zoning

Abstract

Spatial resolution and zoning affect models and predictions of species distribution models. I compared grain sizes of 90 m grid cells to ecological units of soil polygons (approximately 209 ha composed of discontinuous polygons of 16 ha), and then introduced error into samples and examined influence of topographic and soil variables. I used random forests, which is a machine learning classifier, and open access data. Predictions based on 90 m grid cells were slightly more accurate than coarser-sized polygons, particularly false positive rates (mean values of 0.11 and 0.16, respectively). The trade-off for accuracy was the number of mapping units required to increase resolution. Probability of presence decreased with resolution. Similarly to grain size comparisons, error affected probability of presence more than accuracy of prediction. Unlike grain size comparisons, the relationship between count of each species (i.e., relative abundance) and area predicted as present was lost with addition of error. Introduction of absences into the modeling sample of presences through plot location error increased probability of presence and introduction of presences into the modeling sample of absences through use of background pseudoabsences decreased probability of presence. Finer resolution amplified the effect of background absences; area predicted for presence was reduced by a factor of 5.4 for grid cells and 1.4 for soil polygons. The choice of fine resolution grid cells or coarser shaped polygons resulted in different models, due to varying influence of topographic variables on models. Use of coarser resolution (tens to hundreds of hectares) may be a worthwhile exchange for greater spatial extent of species distribution models and use of ecologically zoned polygons appeared to avoid the modifiable areal unit problem.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

Species distribution models provide continuous maps of survey data. Choice of scale for both study extent and spatial resolution of analysis affects estimates and comparisons among studies (Dungan et al., 2002; Rahbek, 2005). For species distribution models, grain is the area of the spatial unit (or minimum mapping unit) for which probability of presence is predicted (McGeoch and Gaston, 2002). Decreasing grain to a finer area results in lower rates of presence and increasing the grain to a coarser area increases the rates of presence exponentially (He and Gaston, 2000). Optimal grain size does not have a gold standard and depends on research objectives, taxa, study region, and perhaps most importantly, the grain size of environmental predictors (Guisan and Thuiller, 2005). Although it may appear that the finest grain size would provide the best information about species distributions, models are limited by quality and resolution of the raw survey data (Gottschalk et al., 2011; Lawes and Piper, 1998). In addition, data can contain locational error, which may be worsened by poor matches with fine resolution environmental predictors or corrected by coarser grain (Graham et al., 2004; Guisan et al., 2007). In contrast, too coarse a grain may produce disparities between species and vegetation types or smoothed topographic features, i.e., the oasis effect (Gottschalk et al., 2011; Lawes and Piper, 1998). Furthermore, grain may no longer be relevant if scaled up or down beyond reasonable scales for conservation or management goals (Huettmann and Diamond, 2006).

If spatial resolution is arbitrary, environmental variables will be aggregated into different sizes (scaling) or spatial arrangements (zoning), changing mean and/or variance (the Modifiable Areal Unit Problem; Jelinski and Wu, 1996; Openshaw and Taylor, 1979). Species distribution models are based on a variety of variable types, including topographic variables from digital elevation models and soil variables, with different grain sizes. Variables from digital elevation models appear to have unlimited and modifiable resolution that is systematic rather than ecologically meaningful (Dark and Bram, 2007; Hay et al., 2001). Conversely, discontinuous soil polygons based on similar soil characteristics (i.e., ecologically meaningful) produce soil map units (or basic entities) of large and varied shape and size. Converting soil variables to grid cells or topographic variables to soil polygons will produce distortion (i.e., change the mean and variance). Furthermore, changing the resolution of species distribution models affects importance of environmental predictors (Rahbek and Graves, 2001).
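
The scaling and zoning effects described in the Introduction can be illustrated with a minimal Python sketch. It is not part of the analysis in this paper. The synthetic elevation gradient, the 5% presence rate, the 64 by 64 grid, and the helper names `aggregate_mean` and `aggregate_presence` are all assumptions made for illustration, as is the rule that a coarse unit counts as present when any fine cell within it is present.

```python
import numpy as np


def aggregate_mean(grid, block_rows, block_cols):
    """Average a 2-D grid over non-overlapping blocks of the given shape."""
    rows, cols = grid.shape
    assert rows % block_rows == 0 and cols % block_cols == 0
    blocks = grid.reshape(rows // block_rows, block_rows,
                          cols // block_cols, block_cols)
    return blocks.mean(axis=(1, 3))


def aggregate_presence(presence, block_rows, block_cols):
    """Mark a coarse unit present if any fine cell inside it is present."""
    rows, cols = presence.shape
    assert rows % block_rows == 0 and cols % block_cols == 0
    blocks = presence.reshape(rows // block_rows, block_rows,
                              cols // block_cols, block_cols)
    return blocks.any(axis=(1, 3))


rng = np.random.default_rng(0)

# Hypothetical fine-grain predictor: a west-to-east gradient plus noise.
elevation = np.linspace(0.0, 100.0, 64)[None, :] + rng.normal(0.0, 10.0, (64, 64))

# Hypothetical sparse presences in roughly 5% of fine cells.
presence = rng.random((64, 64)) < 0.05

# Scaling: the same data summarized in coarser square units (8 x 8 fine cells).
coarse = aggregate_mean(elevation, 8, 8)

# Zoning: units of equal area (64 fine cells) but different arrangement.
strips_ns = aggregate_mean(elevation, 64, 1)  # north-south strips
strips_ew = aggregate_mean(elevation, 1, 64)  # east-west strips

print("variance, fine cells:       ", elevation.var())
print("variance, 8 x 8 blocks:     ", coarse.var())
print("variance, north-south zones:", strips_ns.var())
print("variance, east-west zones:  ", strips_ew.var())
print("presence rate, fine cells:  ", presence.mean())
print("presence rate, 8 x 8 blocks:", aggregate_presence(presence, 8, 8).mean())
```

With equal-sized blocks the overall mean is unchanged by aggregation, but the variance among units shrinks, and two zonings of identical unit area can give very different variances depending on how they align with the underlying gradient. Under the any-cell rule, the rate of presence cannot decrease as units become coarser.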