Are pollen-based climate models improved by combining surface samples from soil and lacustrine substrates?

Simon Goring a,⁎, Terri Lacourse b, Marlow G. Pellatt c,d, Ian R. Walker e, Rolf W. Mathewes a

a Department of Biological Sciences, Simon Fraser University, Burnaby, Canada V5A 1S6
b Department of Geography, University of Victoria, Victoria, British Columbia, Canada V8W 3R4
c Parks Canada, Western and Northern Service Centre, 300-300 West Georgia Street, Vancouver, British Columbia, Canada V6B 6B4
d School of Resource and Environmental Management, Simon Fraser University, Burnaby, Canada V5A 1S6
e Biology, and Earth & Environmental Sciences, University of British Columbia Okanagan, Kelowna, BC, Canada V1V 1V7

Article history: Received 11 December 2009; Received in revised form 29 June 2010; Accepted 30 June 2010; Available online 7 July 2010

Keywords: pollen databases; depositional environment; lacustrine; terrestrial; palynology; climate models; weighted averaging; randomForest; modern analogue technique; partial least squares; non-metric-multidimensional-scaling

Abstract

Differences between pollen assemblages obtained from lacustrine and terrestrial surface sediments may affect the ability to obtain reliable pollen-based climate reconstructions. We test the effect of combining modern pollen samples from multiple depositional environments on various pollen-based climate reconstruction methods using modern pollen samples from British Columbia, Canada and adjacent Washington, Montana, Idaho and Oregon states. This dataset includes samples from a number of depositional environments including soil and lacustrine sediments.
Combining lacustrine and terrestrial (soil) samples increases root mean squared error of prediction (RMSEP) for reconstructions of summer growing degree days when weighted-averaging partial-least-squares (WAPLS), weighted-averaging (WA) and non-metric-multidimensional-scaling/generalized-additive-models (NMDS/GAM) are used, but reduces RMSEP for randomForest, the modern analogue technique (MAT) and the Mixed method, although a slight increase occurs for MAT at the highest sample size. Summer precipitation reconstructions using MAT, randomForest and NMDS/GAM suffer from increased RMSEP when both lacustrine and terrestrial samples are used, but WA, WAPLS and the Mixed method show declines in RMSEP. These results indicate that researchers interested in using pollen databases to reconstruct climate variables need to consider the depositional environments of samples within the analytical dataset, since pooled datasets can increase model error for some climate variables. However, because the effects of pooled datasets vary between climate variables and between pollen-based climate reconstruction methods, we do not reject the use of mixed samples altogether. We finish by proposing steps to test whether significant reductions in model error can be obtained by splitting or combining samples from multiple substrates.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

Large surface-sample databases allow researchers to relate modern pollen distribution to regional and continental-scale climates (North America: Whitmore et al., 2005; Africa: Gajewski et al., 2002; Europe: Dormoy et al., 2009; China: Members of China Quaternary Pollen Data Base, 2001). The size of the dataset used in an analysis plays an important role in model quality. For chironomids it has been shown that more precise and accurate reconstructions of past environments from sedimentary archives become possible as dataset breadth across a climate gradient increases (Walker et al., 1997).
Chironomid datasets are obtained only from lacustrine-type environments and so may only be extended using other lacustrine samples. In contrast to chironomid datasets, pollen datasets can extend coverage of a region or climate gradient by including modern pollen assemblages obtained from several depositional environments. These depositional environments may include peat, soil or lacustrine sediments. If a climate reconstruction uses pollen assemblages obtained from lacustrine sediments, it may be possible to extend coverage along a climate or vegetation gradient by including pollen obtained from both lacustrine and soil samples. This may be of particular interest in dry regions such as grasslands where lacustrine depositional environments may be absent or limited. The literature makes it clear that the effects of mixed depositional environments have been of concern for some time, and most large databases contain information about depositional environment (Davis and Webb, 1975). Although extending the dataset in this way may seem desirable, the effect of mixing samples from multiple depositional environments on model error for pollen–climate reconstructions has yet to be examined in detail.

Review of Palaeobotany and Palynology 162 (2010) 203–212
⁎ Corresponding author. Tel.: +1 778 782 4458. E-mail address: sgoring@sfu.ca (S. Goring).
0034-6667. doi:10.1016/j.revpalbo.2010.06.014