Integrated GIS and multivariate statistical analysis for regional scale assessment of heavy metal soil contamination: A critical review*

Deyi Hou a,*, David O'Connor a, Paul Nathanail b, Li Tian c, Yan Ma d

a School of Environment, Tsinghua University, Beijing, 100084, China
b School of Geography, University of Nottingham, Nottingham, NG7 2RD, UK
c Department of Urban Planning, School of Architecture, Tsinghua University, Beijing, 100084, China
d School of Chemical and Environmental Engineering, China University of Mining & Technology, Beijing 100083, China

Article info

Article history: Received 4 April 2017; Received in revised form 4 July 2017; Accepted 7 July 2017; Available online xxx

Keywords: GIS; Kriging; Multivariate statistical analysis; Principal component analysis; Cluster analysis

Abstract

Heavy metal soil contamination is associated with potential toxicity to humans or ecotoxicity. Scholars have increasingly used a combination of geographical information science (GIS) with geostatistical and multivariate statistical analysis techniques to examine the spatial distribution of heavy metals in soils at a regional scale. A review of such studies showed that most soil sampling programs were based on grid patterns and composite sampling methodologies. Many programs intended to characterize various soil types and land use types. The most often used sampling depth intervals were 0–0.10 m, or 0–0.20 m, below surface; and the sampling densities used ranged from 0.0004 to 6.1 samples per km², with a median of 0.4 samples per km². The most widely used spatial interpolators were inverse distance weighted interpolation and ordinary kriging; and the most often used multivariate statistical analysis techniques were principal component analysis and cluster analysis.
The review also identified several determining and correlating factors in heavy metal distribution in soils, including soil type, soil pH, soil organic matter, land use type, Fe, Al, and heavy metal concentrations. The major natural and anthropogenic sources of heavy metals were found to be lithogenic origin, roadway and transportation, atmospheric deposition, wastewater and runoff from industrial and mining facilities, fertilizer application, livestock manure, and sewage sludge. This review argues that the potential of integrated GIS and multivariate statistical analysis for assessing heavy metal distribution in soils on a regional scale has not yet been fully realized. It is proposed that future research be conducted to map multivariate results in GIS to pinpoint specific anthropogenic sources, to analyze temporal trends in addition to spatial patterns, to optimize modeling parameters, and to expand the use of different multivariate analysis tools beyond principal component analysis (PCA) and cluster analysis (CA).

© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

Whilst it is acknowledged that there is no authoritative definition of the term "heavy metals" to be found in the relevant literature (Duffus, 2002), the present study uses the term as a group name for metals and semimetals (metalloids) that have been associated with soil contamination and potential toxicity or ecotoxicity. The heavy metals that have been most intensively studied within the reviewed publications include Pb, Zn, Cu, Ni, Cr, and Cd, listed in descending order of frequency.

Heavy metal contamination in soil has become a serious issue globally (Jarup, 2003; Jingling et al., 2016; Phoungthong et al., 2016). Harmful amounts of heavy metals can enter the human body from contaminated soil via exposure pathways such as direct or indirect ingestion, inhalation, and dermal contact, potentially resulting in adverse human health effects.
Heavy metals can also exhibit ecotoxicity, leading to impaired ecological health in addition to bioaccumulation in the food chain. To address this issue, the fate and transport of heavy metals in soil, as well as the remediation of contaminated soils, have been intensively studied (Hou et al., 2016; Ma et al., 2015; Tsang and Lo, 2006; Tsang et al., 2007, 2009). It is also of utmost importance to be able to robustly discern the spatial distribution of heavy metals in soils at the regional scale, in order to enable sound assessment of human and ecological risks, and to implement efficient pollution

* This paper has been recommended for acceptance by Dr. Yong Sik Ok.
* Corresponding author. E-mail address: houdeyi@tsinghua.edu.cn (D. Hou).

Contents lists available at ScienceDirect
Environmental Pollution
journal homepage: www.elsevier.com/locate/envpol
http://dx.doi.org/10.1016/j.envpol.2017.07.021
0269-7491/© 2017 Elsevier Ltd. All rights reserved.
Environmental Pollution xxx (2017) 1–13
Please cite this article in press as: Hou, D., et al., Integrated GIS and multivariate statistical analysis for regional scale assessment of heavy metal soil contamination: A critical review, Environmental Pollution (2017), http://dx.doi.org/10.1016/j.envpol.2017.07.021