Generation of fine-scale population layers using multi-resolution satellite imagery and geospatial data ☆

Derek Azar a, Ryan Engstrom a,b, Jordan Graesser a, Joshua Comenetz a,⁎

a U.S. Census Bureau, 4600 Silver Hill Road, Washington, DC 20233, United States
b Department of Geography, The George Washington University, 1922 F Street NW (Old Main), Washington, DC 20052, United States

Article info

Article history: Received 2 March 2012; received in revised form 19 November 2012; accepted 23 November 2012; available online 29 December 2012

Keywords: Population distribution; Built-up area; Pakistan; Dasymetric; CART

Abstract

A gridded population dataset was produced for Pakistan by developing an algorithm that distributed population either on the basis of per-pixel built-up area fraction or the per-pixel value of a weighted population likelihood layer. Per-pixel built-up area fraction was calculated using a classification and regression trees (CART) methodology integrating high- and medium-resolution satellite imagery. The likelihood layer was produced by weighting different geospatial layers according to their effect on the likelihood of population being found in the particular pixel. The geospatial layers integrated into the likelihood layer were: 1) proximity to remotely sensed built-up pixels, 2) density of settlement points in a fixed kernel, 3) slope, 4) elevation, and 5) heterogeneity of landcover types found within a search radius. The method for weighting these layers varied according to settlement patterns found in the provinces of Pakistan. Differences in zonal population estimates generated from the 100-meter gridded population layer resulting from this study, Oak Ridge National Laboratory's LandScan (2002), and CIESIN's Gridded Population of the World and Global Rural Urban Mapping Project (GPW and GRUMP) are examined.
Population estimates for small areas produced using this paper's method were found to differ from census counts to a lesser degree than those produced using LandScan, GPW, or GRUMP. The root mean square error (RMSE) values of the small-area population estimates for this method, LandScan, GPW, and GRUMP were 31,089, 48,001, 100,260, and 72,071, respectively.

Published by Elsevier Inc.

1. Introduction

Readily available and accurate data on spatial population distribution have multiple uses for humanitarian relief, disaster response planning, and development assistance. To assist in these areas, the U.S. Census Bureau, using remotely sensed imagery and population census data, created a highly detailed population distribution dataset for Pakistan. This work is part of the Census Bureau's ongoing Populations at Risk initiative, which follows a 2007 National Research Council (NRC) report that called for the Census Bureau to produce high-resolution, subnational estimates for those populations that are at risk of exposure to natural disasters and complex humanitarian emergencies.

Population data are a fundamental component of both policymaking and disaster response. In order to be useful, however, baseline data must be timely, detailed, and spatially enabled (Noji, 2005). These criteria can be met by reliable and recent population censuses and surveys. Many countries, however, including some prone to natural disasters and humanitarian emergencies, lack recent census and survey data. Additionally, the lack of a link between population data and digital geographic boundaries can further reduce the availability of information on populations at risk. In this case, data tables may exist for small areas at lower levels of administrative geography. However, without a link between digital maps and demographic data, users are limited to what can be gleaned from place names found within the census or survey.
Researchers have recognized for over a decade that the technical difficulty of linking human geography and data produced from remotely sensed sources is a barrier to the increased use of satellite data (Rindfuss & Stern, 1998). The absence of population data, either because the data are not collected or because they lack useful accompanying geographical data, is a formidable obstacle to policymaking and humanitarian response in parts of the developing world (NRC, 2007).

Remote Sensing of Environment 130 (2013) 219–232
☆ This paper is released to inform interested parties of research and to encourage discussion. Any views expressed on statistical, methodological, or technical issues are those of the author(s) and not necessarily those of the U.S. Census Bureau.
⁎ Corresponding author. Tel.: +1 301 763 1408; fax: +1 301 763 2516. E-mail address: Joshua.Comenetz@census.gov (J. Comenetz).
0034-4257/$ – see front matter. Published by Elsevier Inc. http://dx.doi.org/10.1016/j.rse.2012.11.022

Satellite imagery can be collected over large areas at relatively low cost compared to the price of conducting a nationally representative population census or household survey. The idea of exploiting the synoptic coverage of satellite imagery to compensate for the uneven temporal and spatial resolution of census data is not new in the geographic literature. Areas of human activity are readily apparent to the naked eye in remotely sensed imagery. Surface characteristics of built-up areas make them recognizable by a human interpreter. For example, built-up areas tend to be bright in all visible bands and tend to be highly textured relative to most natural surfaces. Human observers