Finer grain size increases effects of error and changes influence of environmental
predictors on species distribution models
Brice B. Hanberry ⁎
University of Missouri, 203 Natural Resources Building, Columbia, MO 65211, USA
Article info
Article history:
Received 14 May 2012
Received in revised form 21 February 2013
Accepted 22 February 2013
Available online 5 March 2013
Keywords:
Minimum mapping unit
Pseudoabsences
Resolution
Soil
Topographic
Zoning
Abstract
Spatial resolution and zoning affect species distribution models and their predictions. I compared grain
sizes of 90 m grid cells to ecological units of soil polygons (approximately 209 ha composed of discontinuous
polygons of 16 ha), and then introduced error into samples and examined influence of topographic and soil
variables. I used random forests, a machine learning classifier, and open access data. Predictions
based on 90 m grid cells were slightly more accurate than those based on coarser polygons, particularly for false
positive rates (mean values of 0.11 and 0.16, respectively). The trade-off for accuracy was the number of mapping
units required to increase resolution. Probability of presence decreased with finer resolution. As with grain
size comparisons, error affected probability of presence more than accuracy of prediction. Unlike grain size
comparisons, the relationship between count of each species (i.e., relative abundance) and area predicted
as present was lost with addition of error. Introduction of absences into the modeling sample of presences
through plot location error increased probability of presence, whereas introduction of presences into the modeling
sample of absences through use of background pseudoabsences decreased probability of presence. Finer
resolution amplified the effect of background absences; area predicted for presence was reduced by a factor
of 5.4 for grid cells and 1.4 for soil polygons. The choice of fine resolution grid cells or coarser shaped
polygons resulted in different models, due to varying influence of topographic variables on models. Use of coarser
resolution (tens to hundreds of hectares) may be a worthwhile exchange for greater spatial extent of species
distribution models, and use of ecologically zoned polygons appeared to avoid the modifiable areal unit problem.
© 2013 Elsevier B.V. All rights reserved.
1. Introduction
Species distribution models provide continuous maps of survey
data. Choice of scale for both study extent and spatial resolution of
analysis affects estimates and comparisons among studies (Dungan
et al., 2002; Rahbek, 2005). For species distribution models, grain is
the area of the spatial unit (or minimum mapping unit) for which
probability of presence is predicted (McGeoch and Gaston, 2002).
Decreasing grain to a finer area results in lower rates of presence,
and increasing the grain to a coarser area increases the rates of presence
exponentially (He and Gaston, 2000).
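The dependence of presence rates on grain can be illustrated with a minimal sketch. It assumes a hypothetical, randomly generated presence grid (not survey data) and the simple rule that a coarse unit counts as present when any fine cell inside it is present. The function names, grid size, and prevalence value are illustrative assumptions, and random placement is not expected to reproduce the exact form of the relationship reported by He and Gaston (2000).

```python
import random

def make_presence_grid(size, prevalence, seed=0):
    """Hypothetical fine-grain presence/absence grid (1 = present)."""
    rng = random.Random(seed)
    return [[1 if rng.random() < prevalence else 0 for _ in range(size)]
            for _ in range(size)]

def coarsen(grid, factor):
    """Aggregate factor x factor blocks; a block is present if any cell is."""
    size = len(grid)
    coarse = []
    for r in range(0, size, factor):
        row = []
        for c in range(0, size, factor):
            block = [grid[i][j]
                     for i in range(r, min(r + factor, size))
                     for j in range(c, min(c + factor, size))]
            row.append(1 if any(block) else 0)
        coarse.append(row)
    return coarse

def occupancy(grid):
    """Proportion of mapping units classified as present."""
    cells = [value for row in grid for value in row]
    return sum(cells) / len(cells)

# Assumed values: 64 x 64 fine cells, 5% of cells occupied at the finest grain.
fine = make_presence_grid(size=64, prevalence=0.05)
rates = {factor: occupancy(coarsen(fine, factor)) for factor in (1, 2, 4, 8)}
for factor, rate in rates.items():
    print(f"grain factor {factor}: proportion of occupied units = {rate:.3f}")
```

Because each coarse unit inherits presence from any of its nested fine cells, the proportion of occupied units cannot decrease as the grain coarsens under this rule.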
Optimal grain size does not have a gold standard and depends on
research objectives, taxa, study region, and perhaps most importantly,
the grain size of environmental predictors (Guisan and Thuiller,
2005). Although it may appear that the finest grain size would provide
the best information about species distributions, models are limited
by quality and resolution of the raw survey data (Gottschalk et al.,
2011; Lawes and Piper, 1998). In addition, data can contain locational
error, which may be worsened by poor matches with fine resolution
environmental predictors or corrected by coarser grain (Graham et al.,
2004; Guisan et al., 2007). In contrast, too coarse a grain may produce
disparities between species and vegetation types or smoothed topographic
features, i.e., the oasis effect (Gottschalk et al., 2011; Lawes
and Piper, 1998). Furthermore, grain may no longer be relevant if scaled
up or down beyond reasonable scales for conservation or management
goals (Huettmann and Diamond, 2006).
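The interaction between locational error and grain can be sketched with a small simulation. It is only an illustration under stated assumptions: square cells stand in for both grain sizes (a 1446 m side gives roughly 209 ha, although soil polygons are irregular), the 30 m positional error and the 10 km extent are hypothetical values, and the function name is invented for this example.

```python
import math
import random

def misassigned_fraction(cell_size, error_sd, n_points=2000,
                         extent=10000.0, seed=2):
    """Fraction of points whose recorded cell differs from their true cell."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(n_points):
        # True plot location, uniform over a square study area (metres).
        x, y = rng.uniform(0, extent), rng.uniform(0, extent)
        # Recorded location with Gaussian positional error (assumed).
        x_obs = x + rng.gauss(0, error_sd)
        y_obs = y + rng.gauss(0, error_sd)
        true_cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        obs_cell = (math.floor(x_obs / cell_size), math.floor(y_obs / cell_size))
        if true_cell != obs_cell:
            wrong += 1
    return wrong / n_points

# The same seed gives both grains identical points and identical errors.
fine_rate = misassigned_fraction(cell_size=90.0, error_sd=30.0)
coarse_rate = misassigned_fraction(cell_size=1446.0, error_sd=30.0)
print(f"90 m cells: {fine_rate:.3f} of plots fall in the wrong cell")
print(f"1446 m cells: {coarse_rate:.3f} of plots fall in the wrong cell")
```

With the same displacement, a plot is more likely to cross a cell boundary when cells are small, which is one way coarser grain can absorb locational error.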
If spatial resolution is arbitrary, environmental variables will
be aggregated into different sizes (scaling) or spatial arrangements
(zoning), changing mean and/or variance (the Modifiable Areal Unit
Problem; Jelinski and Wu, 1996; Openshaw and Taylor, 1979). Species
distribution models are based on a variety of variable types, including
topographic variables from digital elevation models and soil
variables, with different grain sizes. Variables from digital elevation
models appear to have unlimited and modifiable resolution that
is systematic rather than ecologically meaningful (Dark and Bram,
2007; Hay et al., 2001). Conversely, discontinuous soil polygons
based on similar soil characteristics (i.e., ecologically meaningful) produce
soil map units (or basic entities) of large and varied shape and
size. Converting soil variables to grid cells or topographic variables
to soil polygons will produce distortion (i.e., change the mean and
variance). Furthermore, changing the resolution of species distribution
models affects importance of environmental predictors (Rahbek
and Graves, 2001).
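The scaling and zoning components of the Modifiable Areal Unit Problem can be sketched numerically. The example below assumes a hypothetical elevation-like surface (a west-east gradient plus random noise) and regular rectangular units; all names and values are illustrative and are not drawn from the study data. It compares unit means when the same surface is aggregated into square blocks and into strips of equal area but different orientation.

```python
import random
from statistics import mean, pvariance

def make_surface(size, seed=1):
    """Hypothetical surface: west-east gradient plus Gaussian noise."""
    rng = random.Random(seed)
    return [[col + rng.gauss(0, 2) for col in range(size)]
            for _ in range(size)]

def unit_means(surface, height, width):
    """Mean value within each height x width mapping unit."""
    size = len(surface)
    means = []
    for r in range(0, size, height):
        for c in range(0, size, width):
            means.append(mean(surface[i][j]
                              for i in range(r, r + height)
                              for j in range(c, c + width)))
    return means

surface = make_surface(16)
cells = unit_means(surface, 1, 1)      # original 256 cells
squares = unit_means(surface, 4, 4)    # scaling: 16 square units of 16 cells
by_rows = unit_means(surface, 1, 16)   # zoning: 16 east-west strips of 16 cells
by_cols = unit_means(surface, 16, 1)   # zoning: 16 north-south strips of 16 cells

for name, values in [("cells", cells), ("4 x 4 squares", squares),
                     ("east-west strips", by_rows),
                     ("north-south strips", by_cols)]:
    print(f"{name}: mean = {mean(values):.2f}, "
          f"variance = {pvariance(values):.2f}")
```

In this sketch the units have equal area, so the overall mean is preserved and only the variance among units changes; with units of unequal size, as in soil polygons, the unweighted mean of units can shift as well.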
Ecological Informatics 15 (2013) 8–13
⁎ Corresponding author. Tel.: +1 573 875 5341x230; fax: +1 573 882 1977.
E-mail address: hanberryb@missouri.edu.
1574-9541/$ – see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.ecoinf.2013.02.003