Int J Appl Earth Obs Geoinformation
journal homepage: www.elsevier.com/locate/jag

Regional soil organic carbon prediction model based on a discrete wavelet analysis of hyperspectral satellite data

Xiangtian Meng a,1, Yilin Bao a,1, Jiangui Liu c, Huanjun Liu a,b,*, Xinle Zhang a, Yu Zhang d, Peng Wang d, Haitao Tang a, Fanchang Kong a

a School of Public Administration and Law, Northeast Agricultural University, Harbin, 150030, China
b Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, 130012, China
c Agriculture and Agri-Food Canada, Eastern Cereal and Oilseed Research Centre, 960 Carling Avenue, Ottawa, Ontario, K1A 0C6, Canada
d Heilongjiang Province National Defence Science and Technology Institute, Harbin, 150030, China

ARTICLE INFO

Keywords: Soil organic carbon; Hyperspectral satellite data; Discrete wavelet analysis; Spectral index; Mapping

ABSTRACT

Most studies have achieved rapid and accurate determination of soil organic carbon (SOC) using laboratory spectroscopy; however, it remains difficult to map the spatial distribution of SOC. To predict and map SOC at a regional scale, we obtained fourteen hyperspectral images from the Gaofen-5 (GF-5) satellite and decomposed and reconstructed the original reflectance (OR) and the first derivative reflectance (FDR) using the discrete wavelet transform (DWT) at different scales. At each of these scales, we selected as inputs the three optimal bands with the highest weight coefficients using principal component analysis, and chose the normalized difference index (NDI), ratio index (RI) and difference index (DI) with the strongest correlation with the SOC content using a contour map method. These inputs were then used to build regional-scale SOC prediction models using random forest (RF), support vector machine (SVM) and back-propagation neural network (BPNN) algorithms.
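The index-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the commonly used two-band definitions NDI = (R_i - R_j)/(R_i + R_j), RI = R_i/R_j and DI = R_i - R_j, which this section does not spell out. It also reads the "contour map method" as an exhaustive search over band pairs for the largest absolute Pearson correlation with SOC. The function names `spectral_indices` and `best_band_pair` are hypothetical.

```python
import numpy as np


def spectral_indices(refl, i, j):
    """Return (NDI, RI, DI) for band columns i and j.

    refl is an (n_samples, n_bands) array of strictly positive reflectance.
    The formulas are the usual two-band definitions (an assumption here).
    """
    a, b = refl[:, i], refl[:, j]
    ndi = (a - b) / (a + b)  # normalized difference index
    ri = a / b               # ratio index
    di = a - b               # difference index
    return ndi, ri, di


def best_band_pair(refl, soc, index="NDI"):
    """Search all ordered band pairs (i != j) and return (r, (i, j)) for the
    pair whose chosen index has the largest |Pearson r| with soc."""
    pos = {"NDI": 0, "RI": 1, "DI": 2}[index]
    n_bands = refl.shape[1]
    best_r, best_pair = 0.0, None
    for i in range(n_bands):
        for j in range(n_bands):
            if i == j:
                continue
            values = spectral_indices(refl, i, j)[pos]
            r = np.corrcoef(values, soc)[0, 1]
            # Skip undefined correlations (e.g. a constant index vector).
            if np.isfinite(r) and abs(r) > abs(best_r):
                best_r, best_pair = r, (i, j)
    return best_r, best_pair
```

Plotting |r| for every (i, j) pair as a two-dimensional map would give the kind of correlation surface from which a best band pair can be read. The double loop is quadratic in the number of bands, so for a full hyperspectral cube a vectorized version may be preferable.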
The results indicated that: 1) at a low decomposition scale, DWT can effectively eliminate the noise in satellite hyperspectral data, and the FDR combined with DWT can improve the SOC prediction accuracy significantly; 2) selecting inputs using principal component analysis and a contour map can eliminate the redundancy of hyperspectral data while retaining the physical meaning of the inputs, and for the model with the highest prediction accuracy, the inputs were all derived from the wavelength range of SOC variations; 3) the differences in prediction accuracy among the different prediction models are small; and 4) the SOC prediction accuracy using hyperspectral satellite data is greatly improved compared with that of previous SOC prediction studies using multispectral satellite data. This study provides a highly robust and accurate method for predicting and mapping regional SOC contents.

1. Introduction

Soil is the largest terrestrial carbon reservoir in the biosphere, accounting for approximately 75 % of the total carbon pool in terrestrial ecosystems. Slight changes in carbon reserves may lead to significant differences in the atmospheric concentration of CO2, thus affecting the global climate (Luo et al., 2010). Accordingly, the monitoring and rapid digital mapping of soil organic carbon (SOC) are among the most important endeavours for developing strategies to mitigate global warming (Jones et al., 2005). To date, most SOC measurements have been conducted in laboratories, and collecting soil samples is usually time consuming and destructive. Moreover, discrete soil samples cannot provide continuous information regarding the spatial characteristics of soil properties; thus, it is difficult to map SOC at regional and global scales (Taghizadeh-Mehrjardi et al., 2016).
Rapidly quantifying the spatial distribution of SOC content has become a research focus, and remote sensing techniques and machine learning algorithms have become powerful assessment tools (Camera et al., 2017). Compared with laboratory SOC prediction approaches, methods using remote sensing data in the visible, near-infrared and shortwave infrared (VNIR/SWIR, 0.4–2.5 μm) domains have been found to be faster and more cost efficient (Chang and Laird, 2002).

https://doi.org/10.1016/j.jag.2020.102111
Received 17 December 2019; Received in revised form 7 February 2020; Accepted 3 March 2020
* Corresponding author at: School of Public Administration and Law, Northeast Agricultural University, Harbin, 150030, China.
E-mail addresses: mxt0123neau@yeah.net (X. Meng), byl1211neau@yeah.net (Y. Bao), jiangui.liu@canada.ca (J. Liu), huanjunliu@yeah.net (H. Liu), xinlezhang@yeah.net (X. Zhang), 77180384@qq.com (Y. Zhang), 395845996@qq.com (P. Wang), 1275715966@qq.com (H. Tang), kfc199551@126.com (F. Kong).
1 First author: I express gratitude to my partner Yilin Bao; without her effort, this research could not have been accomplished. In the process of compilation, she made great contributions to data preprocessing, analysis, and writing. Therefore, I hope Yilin Bao and I (Xiangtian Meng) can be the first author together.
Int J Appl Earth Obs Geoinformation 89 (2020) 102111
0303-2434/ © 2020 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/BY/4.0/).

T