Large topsoil organic carbon variability is controlled by Andisol properties and effectively assessed by VNIR spectroscopy in a coffee agroforestry system of Costa Rica

Rintaro Kinoshita a,*, Olivier Roupsard b,c, Tiphaine Chevallier d, Alain Albrecht d, Simon Taugourdeau e, Zia Ahmed f, Harold M. van Es a

a School of Integrative Plant Science, Soil and Crop Sciences Section, Cornell University, Ithaca, NY 14853-1901, USA
b CIRAD, UMR Eco&Sols (Ecologie Fonctionnelle & Biogéochimie des Sols et des Agro-écosystèmes), 34060 Montpellier, France
c CATIE (Tropical Agricultural Centre for Research and Higher Education), 7170 Turrialba, Costa Rica
d IRD, UMR Eco&Sols (Ecologie Fonctionnelle & Biogéochimie des Sols et des Agro-écosystèmes), 34060 Montpellier, France
e CIRAD, UMR SELMET (Systèmes d'élevage méditerranéens et tropicaux), 34398 Montpellier, France
f CIMMYT (International Maize and Wheat Improvement Center), Dhaka 1207, Bangladesh

Article info
Article history: Received 25 September 2014; received in revised form 8 July 2015; accepted 16 August 2015; available online xxxx
Keywords: Soil organic carbon; VNIR spectroscopy; Random Forest; Co-kriging; Andisols; Allophane; Agroforestry

Abstract
Assessing the spatial variability of soil organic carbon (SOC) is crucial for SOC monitoring and comparing management options. Topsoil (0–5 cm) SOC concentrations were surveyed in a coffee agroforestry watershed (0.9 km²) on Andisols in Costa Rica with uniform farm management. We encountered high values and large spatial variations of SOC, from 48.1 to 172 g kg⁻¹ in the dry combustion set (SOC_ref; n = 72) used for calibrating the visible-near-infrared reflectance spectroscopy (VNIRS) samples (SOC_VNIRS; 350–2500 nm; n = 520). VNIRS using partial least squares regression was effective in predicting SOC (R² = 0.85; root mean square error (RMSE) = 12.3 g kg⁻¹) and proved an effective proxy measurement.
We assessed several topographic, vegetation and andic soil property variables, of which only the latter (metal–humus complexes and allophanes) displayed strong correlations with SOC_ref concentrations. We compared Random Forest and three geostatistical approaches for the interpolation of SOC in unsampled locations. Ordinary kriging with SOC_ref yielded an RMSE of 28.0 g kg⁻¹. Random Forest successfully incorporated many covariates that were weakly and non-linearly correlated with SOC (RMSE = 14.7 g kg⁻¹), provided that Al_p (sodium pyrophosphate-extractable aluminum) was included; Al_p was the best predictor of SOC (r = 0.85) but also the most costly variable to acquire. Co-kriging with Al_p also substantially reduced RMSE (to 16.0 g kg⁻¹). Co-kriging with SOC_VNIRS reduced RMSE only marginally, to 24.2 g kg⁻¹, because of a high nugget effect. Local variability of SOC in this volcanic agroforestry watershed was dominated by andic properties, whereas topographic or vegetation variables had very little impact. We recommend estimating SOC variability with inexpensive proxy measurements such as VNIRS (RMSE = 12.3 g kg⁻¹) rather than with spatial interpolation techniques.
© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Soil organic carbon (SOC) is a fundamental property related to soil physical, chemical and biological quality and is an important component of the global carbon (C) cycle (Magdoff and van Es, 2009). Disruption of sustainable C cycles in agricultural soils has led to diminishing crop yields and has contributed to accelerating greenhouse gas (GHG) emissions (Hillel and Rosenzweig, 2010; Lal, 2006; Powlson et al., 2011). At the farm scale, high spatial variation of SOC may occur, which causes uncertainty when comparing several management practices or when assessing the effectiveness of various soil conservation measures to restore SOC (Minasny et al., 2013).
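To illustrate why high within-farm variability complicates such comparisons, the following is a minimal sketch, not part of the study's analysis. It assumes independent samples (which spatially correlated soil data may well violate), and all SOC values and the function name `se_of_mean_difference` are hypothetical. Two pairs of plots have the same 10 g kg⁻¹ difference in mean SOC, but the standard error of that difference grows with the spread of the samples.

```python
import math
import statistics


def se_of_mean_difference(a, b):
    """Standard error of the difference between two sample means.

    Uses the unpooled (Welch-type) form sqrt(s_a^2/n_a + s_b^2/n_b),
    which assumes the observations are independent.
    """
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    return math.sqrt(var_a / len(a) + var_b / len(b))


# Hypothetical topsoil SOC concentrations (g kg^-1) under two practices.
# Both pairs differ by 10 g kg^-1 in their means.
low_var_a = [98, 100, 102, 100]
low_var_b = [108, 110, 112, 110]
high_var_a = [60, 100, 140, 100]
high_var_b = [70, 110, 150, 110]

mean_diff = statistics.mean(low_var_b) - statistics.mean(low_var_a)
se_low = se_of_mean_difference(low_var_a, low_var_b)
se_high = se_of_mean_difference(high_var_a, high_var_b)

print(f"difference in means: {mean_diff:.1f} g kg^-1")
print(f"standard error, low variability:  {se_low:.2f} g kg^-1")
print(f"standard error, high variability: {se_high:.2f} g kg^-1")
```

In this made-up case the same mean difference is several times its standard error when variability is low, but smaller than its standard error when variability is high, so it could not be distinguished from sampling noise without more samples or a better covariate.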
There is a need for accurate approaches to assess the impact of management on SOC at the farm scale, whatever the inherent variability.

Various biotic and abiotic variables have been identified as correlating with SOC at various spatial scales and in various soil environments, such as past and present land use (Schulp and Veldkamp, 2008), local terrain (Cambule et al., 2014; Thompson and Kolka, 2005), and vegetation (Bou Kheir et al., 2010; Horwath Burnham and Sletten, 2010; Kunkel et al., 2011; Takata et al., 2007). These correlated variables have been used to predict SOC through various methods such as multiple linear regression (Gessler et al., 2000; Thompson and Kolka, 2005), Random Forest (RF; Grimm et al., 2008), boosted regression tree (Razakamanarivo et al., 2011), co-kriging (Terra et al., 2004) and regression kriging

Geoderma 262 (2016) 254–265
* Corresponding author at: 1007 Bradfield Hall, Cornell University, Ithaca, NY 14853-1901.
E-mail address: rk422@cornell.edu (R. Kinoshita).
http://dx.doi.org/10.1016/j.geoderma.2015.08.026
0016-7061/© 2015 Elsevier B.V. All rights reserved.
Journal homepage: www.elsevier.com/locate/geoderma