Soil Science Society of America Journal
Soil Sci. Soc. Am. J. doi:10.2136/sssaj2018.03.0100
Received 8 Mar. 2018. Accepted 21 Aug. 2018.
*Corresponding author (j.triantafilis@unsw.edu.au).
© Soil Science Society of America, 5585 Guilford Rd., Madison WI 53711 USA. All Rights reserved.

A Vis-NIR Spectral Library to Predict Clay in Australian Cotton Growing Soil

Soil Physics & Hydrology

To maintain the profitability of the cotton growing areas of Australia, information to guide nutrient management and water-use efficiency is needed. In this regard, information about clay is required. However, measuring clay in the laboratory is time-consuming and expensive. An alternative is to use visible-near infrared (vis-NIR) spectroscopy, which has shown potential at different scales (e.g., local and global). Here, we predicted clay using a machine learning algorithm (Cubist) from vis-NIR spectra acquired from topsoil (0–0.3 m) and subsurface (0.3–0.6 m) samples in seven cotton growing areas. The first aim was to assess the ability of soil samples from each area to predict clay independently. The second aim was to determine the ability of the samples from six areas to predict clay in an area withheld from the calibration. The third aim was to explore the potential to improve prediction using "spiking". The fourth aim was to determine how much data was necessary to establish a suitable library. We conclude that establishing a calibration from each area independently was more accurate than making a calibration from six areas and predicting clay in the area withheld from the calibration. We also found that improvements in model performance were possible using spiking. When using samples from topsoil or subsurface only, over 93 samples were required to obtain an accurate library. We also conclude that a combined dataset of topsoil and subsurface samples gave a more consistent set of data with no loss of calibration and prediction accuracy, especially when considering the availability of calibration samples.
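The second and third aims (withholding one area from the calibration, then "spiking" the calibration with a few samples from that area) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the data are synthetic, the area names (`area_1` to `area_7`) are hypothetical, a single predictor stands in for a full vis-NIR spectrum, and ordinary least squares is swapped in for Cubist so that the example needs only the Python standard library. It shows how the samples are partitioned, not the authors' implementation or results.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a simple stand-in for Cubist)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rmse(pred, obs):
    """Root mean square error between predicted and observed values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def leave_one_area_out(data, held_out, n_spike=0, seed=0):
    """Calibrate on all areas except `held_out`, optionally adding `n_spike`
    randomly chosen samples from the held-out area, and validate on the
    held-out samples that were NOT used for spiking.

    data: {area_name: [(predictor, clay_percent), ...]}
    """
    rng = random.Random(seed)
    target = list(data[held_out])
    rng.shuffle(target)
    spike, validation = target[:n_spike], target[n_spike:]

    calibration = [s for area, samples in data.items()
                   if area != held_out for s in samples]
    calibration += spike

    a, b = fit_line([x for x, _ in calibration], [y for _, y in calibration])
    pred = [a + b * x for x, _ in validation]
    return {
        "n_cal": len(calibration),
        "n_val": len(validation),
        "rmse": rmse(pred, [y for _, y in validation]),
    }


def make_synthetic(n_areas=7, n_per_area=40, seed=1):
    """Synthetic data: each hypothetical area gets its own clay offset."""
    rng = random.Random(seed)
    data = {}
    for i in range(n_areas):
        offset = rng.uniform(-5, 5)
        samples = []
        for _ in range(n_per_area):
            x = rng.uniform(0.2, 0.8)
            clay = 10 + 60 * x + offset + rng.gauss(0, 2)
            samples.append((x, clay))
        data[f"area_{i + 1}"] = samples
    return data


if __name__ == "__main__":
    data = make_synthetic()
    for n_spike in (0, 10):
        print(n_spike, leave_one_area_out(data, "area_7", n_spike=n_spike))
```

Note that the spiking samples are removed from the validation set, so the model is never evaluated on samples it was calibrated with. The first aim (an independent calibration per area) would instead split a single area's samples into calibration and validation subsets, and the fourth aim could be explored by repeating the calibration with progressively larger calibration subsets.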
Abbreviations: CEC, cation-exchange capacity; CV, coefficient of variation.

The cotton growing areas in south-eastern Australia are highly productive. To maintain profitability and improve soil fertility and water-use efficiency, the spatial distribution of key soil properties needs to be determined (Vasu et al., 2017; Roudier et al., 2017). In this regard, clay content is informative because it controls cation-exchange capacity and water holding capacity (Meier, 1999; Aina and Periaswamy, 1985). However, generating information about clay across the large cotton growing areas is time-consuming (because of the need to acquire many samples) and cost-prohibitive (because the hydrometer method is labor intensive, requiring dispersion, sedimentation, and decanting). There is a need to develop efficient and affordable laboratory methods to predict clay, to enable data generation across large spatial extents but also at the local scale where soil management occurs.

While more efficient laboratory instruments such as laser diffraction have been developed and shown to produce equivalent results (Fisher et al., 2017), much care and attention is required to prepare samples and maintain equipment. More recently, advances have been made to add value to the clay data that can be collected using laboratory methods. This has been achieved by using a pedometric approach, where easier and cheaper to acquire vis-NIR spectroscopy data have been coupled to laboratory-measured clay (Viscarra Rossel et al., 2006; Cécillon et al., 2009) using mathematical models including partial least squares (Ji et al., 2014), decision

Dongxue Zhao, Xueyu Zhao, Tibet Khongnawang, Maryem Arshad, John Triantafilis*
School of Biological, Earth and Environmental Science, UNSW Sydney, Kensington NSW 2052, Australia

Core Ideas
Vis-NIR spectral libraries were built across seven cotton growing areas of Australia.
Establishing a calibration from each area independently was more accurate than making a calibration from the other six areas.
Model performance improved using a spiking algorithm.
Model performance improved by combining topsoil and subsurface samples.

Published online November 21, 2018