Systematic land cover bias in Collection 5 MODIS cloud mask and derived products: A global overview

Adam M. Wilson a,*, Benoit Parmentier b, Walter Jetz a

a Department of Ecology and Evolutionary Biology, Yale University, 165 Prospect St, New Haven, CT, USA
b National Center for Ecological Analysis and Synthesis, 735 State Street, Suite 300, Santa Barbara, CA, USA

Article history: Received 26 July 2013; received in revised form 24 October 2013; accepted 26 October 2013; available online 23 November 2013

Keywords: MODIS; Cloud detection; Land cover; Net primary productivity; Land surface temperature; Bias; Validation

Abstract

Identifying cloud interference in satellite-derived data is a critical step toward developing useful remotely sensed products. Most MODIS land products use a combination of the MODIS (MOD35) cloud mask and the internal cloud mask of the surface reflectance product (MOD09) to mask clouds, but there has been little discussion of how these masks differ globally. We calculated global mean cloud frequency for both products for 2009 and found that inflated proportions of observations were flagged as cloudy in the Collection 5 MOD35 product. These erroneously categorized areas were spatially and environmentally non-random and usually occurred over high-albedo land cover types (such as grassland and savanna) in several regions around the world. Additionally, we found that spatial variability in the processing path applied in the Collection 5 MOD35 algorithm affects the likelihood of a cloudy observation by up to 20% in some areas. These factors result in abrupt transitions in recorded cloud frequency across land cover and processing-path boundaries, impeding their use for fine-scale spatially contiguous modeling applications.
We show that together, these artifacts have resulted in significantly decreased and spatially biased data availability for Collection 5 MOD35-derived composite MODIS land products such as land surface temperature (MOD11) and net primary productivity (MOD17). Finally, we compare our results to mean cloud frequency in the new Collection 6 MOD35 product and find that land cover artifacts have been reduced but not eliminated. Collection 6 thus increases data availability for some regions and land cover types in MOD35-derived products, but practitioners need to consider how the remaining artifacts might affect their analysis.

© 2013 Elsevier Inc. All rights reserved.

1. Introduction

Identifying the presence of clouds, which cover 70% of the Earth's surface (Stubenrauch et al., 2013), is a difficult yet vital step in developing products that accurately reflect land surface phenomena from remotely sensed data (cf. Moody, King, Platnick, Schaaf, & Gao, 2005; Platnick et al., 2003). MODIS land products contain pixel-level flags that indicate cloud interference derived from two cloud detection algorithms: MOD35, which was developed by the MODIS atmosphere team (Ackerman et al., 1998; Ackerman et al., 2010), and another within the PGE11 program used to generate the MOD09 surface reflectance product (Vermote, Kotchenova, & Ray, 2011).

The MOD35 cloud mask uses up to 22 of the 36 MODIS bands, ecosystem type, and other environmental data in a suite of tests to identify the presence of clouds or other obstructions. The suite of tests applied in a pixel depends on the ‘processing path’, which is designed to account for spectral differences associated with land cover and the associated variability in albedo. For example, detecting clouds over a forest requires a different set of tests and thresholds than detecting them over a glacier.
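The dependence of the cloud decision on the processing path can be illustrated with a minimal sketch. It is not the MOD35 algorithm: the two path names, the single reflectance test, the threshold values, and the observation series are all hypothetical and chosen only to show how identical observations can yield different cloud frequencies under different paths.

```python
# Toy illustration only. The path names, the single reflectance test, and the
# thresholds below are hypothetical; they are not MOD35 tests or values.
HYPOTHETICAL_THRESHOLDS = {
    "dark_surface": 0.30,    # assumed threshold for a low-albedo path
    "bright_surface": 0.45,  # assumed threshold for a high-albedo path
}


def is_cloudy(reflectance, processing_path):
    """Flag one observation as cloudy if it exceeds the path's threshold."""
    return reflectance > HYPOTHETICAL_THRESHOLDS[processing_path]


def cloud_frequency(reflectances, processing_path):
    """Fraction of observations flagged cloudy under a given processing path."""
    flags = [is_cloudy(r, processing_path) for r in reflectances]
    return sum(flags) / len(flags)


# Identical made-up observations for two neighboring pixels that differ only
# in the processing path assigned to them.
observations = [0.10, 0.25, 0.35, 0.40, 0.50, 0.60, 0.20, 0.33]

freq_dark = cloud_frequency(observations, "dark_surface")
freq_bright = cloud_frequency(observations, "bright_surface")
print(f"dark_surface path:   {freq_dark:.3f}")
print(f"bright_surface path: {freq_bright:.3f}")
```

In this toy case the same eight observations give a higher cloud frequency under the lower threshold, which is the kind of path-dependent difference a per-pixel frequency map would show as a step along a path boundary.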
In the Collection 5 MOD35, the four processing paths (‘water’, ‘coast’, ‘land’, or ‘desert’) were designated using the AVHRR-derived Olson 1-km Global Land Cover Characteristics Data Base Version 2.0 (Loveland et al., 2000), though the Olson land cover classes included in ‘land’ and ‘desert’ varied globally (pers. comm., Richard Frey). Thus, ‘processing path’ should be thought of as the suite of cloud tests and thresholds applied in each pixel rather than as a land cover type. Once identified, the selected processing path is applied to every swath-level MODIS observation within a MODIS collection. This means that pixels with different processing paths will be subjected to different sets of cloud detection tests and thresholds, even though those pixels may be adjacent.

In contrast, the internal MOD09 cloud mask uses only two reflective tests and a thermal test to identify clouds (Frey, 2010). The two reflective tests are designed to be complementary: one flags low or high reflective clouds, and the other catches high clouds even if they have low reflectivity.

Remote Sensing of Environment 141 (2014) 149–154
* Corresponding author. Tel.: +1 240 979 7404. E-mail address: Adam.wilson@yale.edu (A.M. Wilson).
0034-4257/$ - see front matter
http://dx.doi.org/10.1016/j.rse.2013.10.025

When one or both of these algorithms identify a pixel as cloudy, the pixel is typically removed or weighted differently in further processing in MODIS land products. For example, the commonly used 16-day composite vegetation index product (MOD13) uses both the MOD35 and MOD09 cloud flags, along with other data, to select between available observations and compositing algorithms. The cloud flags are used to