Data integration model to assess soil organic carbon availability

Ana Horta ⁎, Amílcar Soares

Center for Natural Resources and Environment, Instituto Superior Técnico, Technical University of Lisbon, Lisbon, 1049-001 Lisbon, Portugal

Article history: Received 29 March 2010; received in revised form 6 August 2010; accepted 27 September 2010; available online 20 October 2010

Keywords: Geostatistics; Joint sequential simulation; Bivariate distribution; Soil organic carbon assessment

Abstract

Soil data acquisition and assessment are crucial phases in the evaluation of soil degradation scenarios. To overcome the lack of field data, flexible sampling approaches can be used to complement conventional soil sampling. For the assessment of soil quality, it is necessary to integrate different soil support data and to provide a coherent spatial characterization of soil properties. This study proposes a new model to combine soil data from two different supports: “point” data, which refers to the concentration measured in the topsoil layer, and “bulk” data, which refers to the concentration measured for the whole soil depth sampled. The method developed uses a geostatistical co-simulation algorithm based on the experimental bi-distribution between both types of soil supports to compute co-simulated values. This new approach was applied to assess Soil Organic Carbon (SOC) availability in the topsoil. The results were used to identify critical areas in the Left Margin of the Guadiana River, an area in the South of Portugal with a high susceptibility to desertification.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

Field data acquisition is one of the most important tasks in soil degradation assessment studies. In practice, a conventional sampling campaign comprises a qualitative description of the soil profile plus sampling per horizon, and further lab analysis for quantitative classification of soil quality indicators.
These sampling requirements make soil quality studies expensive and time-consuming and, as a result, such studies often lack field data. Also, conventional soil sampling methods provide enough information to describe the vertical variability of soil properties but not their spatial continuity over a short distance. Short-range spatial continuity is therefore extremely difficult to evaluate when only conventional sampling data are available. One possible way to assess spatial continuity is to collect bulk, undisturbed soil samples of the first 40 to 50 cm of soil (depending on field conditions). In the first stage of a soil campaign, bulk sampling is an alternative to sampling per horizon; it is faster and cheaper, and it allows for a more representative soil sampling over a larger area.

Based on this assumption, this work presents a model to integrate data from both sampling approaches, namely:

- “bulk” data, concerning the concentration measured for the whole soil depth sampled (usually 40 to 50 cm of soil); and
- “point” data, referring to a concentration measured in the topsoil layer (obtained by conventional soil sampling that starts by collecting soil from the first 5, 10 or 20 cm of the topsoil layer).

The algorithm developed to combine soil data from two different supports uses geostatistical co-simulation.

Several papers discuss the importance of including geostatistics for the prediction/characterization of soil quality (see, for example, Heuvelink and Webster, 2001; Sun et al., 2003). In general, geostatistical applications in soil science use estimation algorithms to assess the spatial distribution of a soil attribute. Besides estimation methods, stochastic simulation algorithms have been applied in the environmental field, in particular to soil quality characterization (Goovaerts, 2001).
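The difference between the two supports defined above can be made concrete with a small numerical sketch. It is an illustration only and not part of the proposed model: the function name `bulk_concentration`, the two-layer profile and its values are hypothetical, and the bulk value is approximated as a thickness-weighted mean that assumes equal bulk density in every layer.

```python
def bulk_concentration(layers):
    """Thickness-weighted mean concentration over the sampled depth.

    layers: list of (thickness_cm, concentration) tuples, top layer first.
    Assumption (illustrative only): every layer has the same bulk density,
    so thickness alone determines each layer's weight.
    """
    total_thickness = sum(thickness for thickness, _ in layers)
    if total_thickness <= 0:
        raise ValueError("total sampled thickness must be positive")
    weighted_sum = sum(thickness * conc for thickness, conc in layers)
    return weighted_sum / total_thickness


# Hypothetical profile: SOC (%) in a 10 cm topsoil layer and in the 30 cm below it.
profile = [(10.0, 2.0), (30.0, 1.0)]

point_value = profile[0][1]               # "point" support: topsoil layer only
bulk_value = bulk_concentration(profile)  # "bulk" support: whole 40 cm sample

print(point_value, bulk_value)  # prints "2.0 1.25"
```

In this hypothetical profile the same location yields 2.0 on the point support and 1.25 on the bulk support, which is why the two data types cannot simply be pooled and need a model that relates them.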
The advantages of using simulation over estimation have been discussed (Goovaerts, 2000) and can be summarized as follows: estimation methods (kriging) minimize the estimation variance but fail to reproduce the spatial variability of the main variables as revealed by the variograms, spatial covariances and histograms of the experimental data. Moreover, stochastic simulation provides the spatial uncertainty associated with spatial estimates, a requirement in several soil studies involving impact and risk assessment (Goovaerts, 1999).

To characterize more than one variable (attribute, soil property or pollutant) and reproduce their joint spatial pattern (given by co-variograms and bi-histograms), several co-simulation algorithms have been used: sequential multi-Gaussian co-simulation (Verly, 1993), multi-Gaussian co-simulation with collocated co-kriging (Almeida and Journel, 1994), co-simulation with the LU decomposition method (Myers, 1988), and simulation of autocorrelation factors (Desbarats and Dimitrakopoulos, 2000; Boucher and Dimitrakopoulos, 2009).

Geoderma 160 (2010) 225–235. Contents lists available at ScienceDirect. Journal homepage: www.elsevier.com/locate/geoderma
⁎ Corresponding author. Tel.: +351 21 841 74 41; fax: +351 21 841 73 89. E-mail address: ahorta@ist.utl.pt (A. Horta).
0016-7061/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.geoderma.2010.09.026

For most practical applications, the stochastic co-simulation algorithm commonly applied to a set of correlated variables is based on a sequential approach. Sequential co-simulation algorithms depend on the formalism used to establish the spatial model of the random variable. Sequential Gaussian Co-simulation (Almeida and Journel, 1994), Sequential Indicator Co-simulation (Goovaerts, 1997)