Systematic land cover bias in Collection 5 MODIS cloud mask and derived products — A global overview

Adam M. Wilson a,⁎, Benoit Parmentier b, Walter Jetz a

a Department of Ecology and Evolutionary Biology, Yale University, 165 Prospect St, New Haven, CT, USA
b National Center for Ecological Analysis and Synthesis, 735 State Street, Suite 300, Santa Barbara, CA, USA

Article history: Received 26 July 2013; received in revised form 24 October 2013; accepted 26 October 2013; available online 23 November 2013

Keywords: MODIS; Cloud detection; Land cover; Net primary productivity; Land surface temperature; Bias; Validation

Abstract

Identifying cloud interference in satellite-derived data is a critical step toward developing useful remotely sensed products. Most MODIS land products use a combination of the MODIS (MOD35) cloud mask and the ‘internal’ cloud mask of the surface reflectance product (MOD09) to mask clouds, but there has been little discussion of how these masks differ globally. We calculated global mean cloud frequency for both products, for 2009, and found that inflated proportions of observations were flagged as cloudy in the Collection 5 MOD35 product. These erroneously categorized areas were spatially and environmentally non-random and usually occurred over high-albedo land cover types (such as grassland and savanna) in several regions around the world. Additionally, we found that spatial variability in the processing path applied in the Collection 5 MOD35 algorithm affects the likelihood of a cloudy observation by up to 20% in some areas. These factors result in abrupt transitions in recorded cloud frequency across land cover and processing-path boundaries, impeding their use for fine-scale spatially contiguous modeling applications.
We show that, together, these artifacts have resulted in significantly decreased and spatially biased data availability for Collection 5 MOD35-derived composite MODIS land products such as land surface temperature (MOD11) and net primary productivity (MOD17). Finally, we compare our results to mean cloud frequency in the new Collection 6 MOD35 product and find that land cover artifacts have been reduced but not eliminated. Collection 6 thus increases data availability for some regions and land cover types in MOD35-derived products, but practitioners need to consider how the remaining artifacts might affect their analysis.

© 2013 Elsevier Inc. All rights reserved.

1. Introduction

Identifying the presence of clouds, which cover 70% of the Earth's surface (Stubenrauch et al., 2013), is a difficult yet vital step in developing products that accurately reflect land surface phenomena from remotely sensed data (cf. Moody, King, Platnick, Schaaf, & Gao, 2005; Platnick et al., 2003). MODIS land products contain pixel-level flags that indicate cloud interference derived from two cloud detection algorithms: MOD35, which was developed by the MODIS atmosphere team (Ackerman et al., 1998; Ackerman et al., 2010), and another within the PGE11 program used to generate the MOD09 surface reflectance product (Vermote, Kotchenova, & Ray, 2011).

The MOD35 cloud mask uses up to 22 of the 36 MODIS bands, ecosystem type, and other environmental data in a suite of tests to identify the presence of clouds or other obstructions. The suite of tests applied in a pixel depends on the ‘processing path,’ which is designed to account for spectral differences associated with land cover and the associated variability in albedo. For example, detecting clouds over a forest requires a different set of tests and thresholds than detecting them over a glacier.
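As a rough illustration of why the choice of processing path matters, the sketch below applies two different sets of thresholds to the same observations and compares the resulting cloud frequencies. Everything in it is an assumption made for illustration: the surface labels `forest` and `glacier` echo the example above rather than actual MOD35 processing paths, the threshold values are invented, and the simple "flag if either test fails" rule stands in for the much larger suite of MOD35 tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathTests:
    """Thresholds for one hypothetical processing path (illustrative values only)."""
    max_clear_reflectance: float  # visible reflectance above this is flagged cloudy
    min_clear_bt_kelvin: float    # brightness temperature below this is flagged cloudy


# Hypothetical thresholds; these are NOT the values used by MOD35.
PROCESSING_PATHS = {
    "forest": PathTests(max_clear_reflectance=0.20, min_clear_bt_kelvin=270.0),
    "glacier": PathTests(max_clear_reflectance=0.90, min_clear_bt_kelvin=230.0),
}


def is_cloudy(path: str, reflectance: float, bt_kelvin: float) -> bool:
    """Flag an observation as cloudy if it fails either test for its path."""
    tests = PROCESSING_PATHS[path]
    too_bright = reflectance > tests.max_clear_reflectance
    too_cold = bt_kelvin < tests.min_clear_bt_kelvin
    return too_bright or too_cold


def cloud_frequency(path: str, observations: list[tuple[float, float]]) -> float:
    """Fraction of observations flagged cloudy under the given path."""
    flags = [is_cloudy(path, refl, bt) for refl, bt in observations]
    return sum(flags) / len(flags)


# Identical (reflectance, brightness temperature) pairs evaluated under each path.
observations = [(0.15, 285.0), (0.35, 280.0), (0.60, 265.0), (0.95, 225.0)]
forest_freq = cloud_frequency("forest", observations)    # 0.75
glacier_freq = cloud_frequency("glacier", observations)  # 0.25
```

With these made-up numbers, the same four observations yield a cloud frequency of 0.75 under one set of thresholds and 0.25 under the other, which is the kind of path-dependent difference that can appear as an abrupt boundary in a cloud frequency map.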
In the Collection 5 MOD35, the four processing paths (‘water,’ ‘coast,’ ‘land,’ or ‘desert’) were designated using the AVHRR-derived Olson 1-km Global Land Cover Characteristics Data Base Version 2.0 (Loveland et al., 2000), though the Olson land cover classes included in ‘land’ and ‘desert’ varied globally (pers. comm., Richard Frey). Thus, “processing path” should be thought of as the suite of cloud tests and thresholds applied in each pixel rather than as a land cover type. Once identified, the selected processing path is applied to every swath-level MODIS observation within a MODIS collection. This means that pixels with different processing paths will be subjected to different sets of cloud detection tests and thresholds, even though those pixels may be adjacent. In contrast, the internal MOD09 cloud mask uses only two reflective tests and a thermal test to identify clouds (Frey, 2010). The two reflective tests are designed to be complementary, with one to flag low or high reflective clouds and the other to catch high clouds even if they have low reflectivity.

Remote Sensing of Environment 141 (2014) 149–154
⁎ Corresponding author. Tel.: +1 240 979 7404. E-mail address: Adam.wilson@yale.edu (A.M. Wilson).
http://dx.doi.org/10.1016/j.rse.2013.10.025

When one or both of these algorithms identify a pixel as cloudy, the pixel is typically removed or weighted differently in further processing in MODIS land products. For example, the commonly used 16-day composite vegetation index product (MOD13) uses both the MOD35 and MOD09 cloud flags, along with other data, to select between available observations and compositing algorithms. The cloud flags are used to