ISSN 1064-2293, Eurasian Soil Science, 2020, Vol. 53, No. 9, pp. 1222–1233. © Pleiades Publishing, Ltd., 2020.
Prediction of Soil Properties Using Random Forest
with Sparse Data in a Semi-Active Volcanic Mountain
H. Piri Sahragard a, * and M. R. Pahlavan-Rad b, **
a Rangeland and Watershed Department, Water and Soil Faculty, University of Zabol, Zabol, Iran
b Soil and Water Research Department, Golestan Agricultural and Natural Resources Research and Education Center, Agricultural Research, Education and Extension Organization (AREEO), Gorgan, Iran
*e-mail: hopiry@uoz.ac.ir
**e-mail: pahlavanrad@gmail.com
Received December 30, 2019; revised February 26, 2020; accepted April 1, 2020
Abstract—Understanding the spatial variation of soil properties is necessary for managing rangeland
vegetation ecosystems. The present study aimed to assess the spatial variation of soil properties on the hillslope
of the semi-active Taftan volcanic mountain, Sistan and Baluchestan Province, south-eastern Iran. The
locations of 30 sampling points were determined using a random-systematic method, and soil samples were taken
from two depths: 0–30 and 30–60 cm. The spatial distribution of soil properties and the relationships between soil
properties and covariates were investigated using the random forest method. The model was validated by
10-fold cross-validation. The results showed that elevation, channel network base level, and vertical
distance to channel network were the most important environmental variables for predicting soil
characteristics such as clay, silt, sand, soil organic carbon (SOC), and electrical conductivity (EC) at the two studied depths.
The maps produced indicated higher clay content at the 30–60 cm depth at higher elevations. EC increased in the
lower parts of the mountain because of leaching. Furthermore, the highest map accuracy was obtained for the EC maps
at both depths and the clay map at the 30–60 cm depth. The prediction maps of the other soil properties had low accuracy.
Keywords: DEM, environmental variables, Taftan, Random forest
DOI: 10.1134/S1064229320090136
INTRODUCTION
Assessing the spatial variation of organic carbon
(OC), electrical conductivity (EC), and soil texture is
important in rangeland ecosystems of arid environments
because of their effects on soil fertility, hydraulic
conductivity, infiltration rate, and erosion. These
characteristics can also influence the distribution of
plant species [31]. Furthermore, different plant species can
significantly affect various soil properties through
moisture depletion, nutrient uptake, and carbon
stabilization [27–30]. Thus, knowledge of the patterns
of soil spatial variation is necessary for sustainable
vegetation management in rangeland ecosystems,
especially in mountainous landscapes.
Determining the distribution of soil properties in
mountainous areas is difficult because of sampling
limitations and the complex processes of soil formation.
Mountain areas have heterogeneous environments
and shallow soils [18]. Here, topography and
local climate are important factors controlling soil
properties such as organic matter (OM) [15]. Because of
the relationship between soil carbon turnover and geomorphology,
different patterns of spatial SOC distribution have been
reported across landscapes [12]. In addition, altitude has been
reported to have a negative effect on SOC content under different
slope aspects; in other words, SOC content decreases with
increasing altitude under different aspects [4].
Altitude variation can therefore explain
changes in SOC stocks [32]. Because of the high
susceptibility of soil to erosion, SOC content differed
among land uses in a Mediterranean
cultivated field [35].
Digital soil mapping (DSM) is a powerful tool for
determining the relationships between soil and environmental
variables and, thereby, the spatial variation of soil properties
in an area of interest [23]. Different statistical models
are used to establish these relationships, such as regression
trees [39], artificial neural networks [22], generalized
additive models [11], geographically weighted regression
kriging [19], random forest [16–29], and Cubist [45].
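Whichever of these models is chosen, its predictions are usually checked against held-out samples. The following is a minimal sketch of the 10-fold cross-validation mentioned in the abstract, assuming 30 sampling points. It is not the authors' code: the function names, the synthetic clay values, and the mean predictor standing in for a fitted model are all illustrative assumptions.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding assigns every k-th shuffled index to the same fold.
    return [indices[i::k] for i in range(k)]


def cross_validate(values, k=10, seed=0):
    """Return (rmse, r2) from k-fold cross-validation of a mean predictor.

    The mean predictor is only a placeholder (an assumption of this sketch)
    for a fitted model such as a random forest driven by covariates.
    """
    folds = k_fold_indices(len(values), k, seed)
    predictions = [0.0] * len(values)
    for held_out in folds:
        held = set(held_out)
        training = [v for i, v in enumerate(values) if i not in held]
        fold_prediction = mean(training)  # "fit" on the other k-1 folds
        for i in held_out:
            predictions[i] = fold_prediction  # predict the held-out points
    overall_mean = mean(values)
    sse = sum((p - v) ** 2 for p, v in zip(predictions, values))
    sst = sum((v - overall_mean) ** 2 for v in values)
    rmse = (sse / len(values)) ** 0.5
    r2 = 1 - sse / sst
    return rmse, r2


# Synthetic, purely illustrative clay contents (%) for 30 hypothetical points.
rng = random.Random(42)
clay = [rng.uniform(5, 35) for _ in range(30)]

folds = k_fold_indices(len(clay), 10)
rmse, r2 = cross_validate(clay)
print(f"fold sizes: {[len(f) for f in folds]}")
print(f"RMSE = {rmse:.2f}, R2 = {r2:.2f}")
```

With 30 points and 10 folds, each model is validated on only three samples. A mean predictor cannot reach a cross-validated R² above zero, so it serves as a baseline against which a covariate-based model can be compared.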
Random forest (RF) has high prediction performance,
and its random selection of variables to generate
each decision tree makes it appropriate for modeling
the spatial distribution of soils [6]. Comparisons of the prediction
accuracy of RF with that of other models for predicting soil
characteristics have shown that RF performs well in
many studies. For instance, a comparative study in Africa
indicated the superiority of RF over linear regres-
GENESIS AND GEOGRAPHY
OF SOILS