Large-Scale Measurement of Aggregate Human Colocation Patterns for Epidemiological Modeling

Shankar Iyer* (a), Brian Karrer (a), Daniel Citron (a), Farshad Kooti (a), Paige Maas (a), Zeyu Wang (b), Eugenia Giraudy (a), P. Alex Dow (a), Alex Pompe (a)

Abstract
To understand and model public health emergencies, epidemiologists need data that describes how humans are moving and interacting across physical space. Such data has traditionally been difficult for researchers to obtain with the temporal resolution and geographic breadth that is needed to study, for example, a global pandemic. This paper describes Colocation Maps, which are spatial network datasets that have been developed within Facebook’s Data For Good program. These Maps estimate how often people from different regions are colocated: in particular, for a pair of geographic regions x and y, these Maps estimate the probability that a randomly chosen person from x and a randomly chosen person from y are simultaneously located in the same place during a randomly chosen minute in a given week. These datasets are well suited to parametrize metapopulation models of disease spread or to measure temporal changes in interactions between people from different regions; indeed, they have already been used for both of these purposes during the COVID-19 pandemic. In this paper, we show how Colocation Maps differ from existing data sources, describe how the datasets are built, provide examples of their use in compartmental modeling, and summarize ideas for further development of these and related datasets. We also conduct the first large-scale analysis of human colocation patterns across the world. Among the findings of this study, we observe that a pair of regions can exhibit high colocation despite few people moving between them.
We also find that although few pairs of people are colocated for many days over the course of a week, these pairs can contribute significant fractions of the total colocation time within a region or between pairs of regions.

Keywords
Human Mobility, Epidemiology, Dataset Release

(a) Facebook, Menlo Park, California, United States
(b) Department of Economics, Stanford University, Stanford, California, United States
*Corresponding author: shankar94@fb.com

medRxiv preprint doi: https://doi.org/10.1101/2020.12.16.20248272; this version posted December 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.

Contents
1 Introduction
2 Materials and Methods
  2.1 Data Sources
  2.2 Dataset Preparation
3 Applications of Colocation Maps
  3.1 An Example Metapopulation Model
  3.2 Colocation vs. Movement
  3.3 Applications of Colocation Maps to Research on COVID-19
4 Assumptions of Colocation Maps
  4.1 Representativeness
  4.2 Within-Region Colocation vs. Between-Region Colocation
  4.3 Contact Heterogeneity
  4.4 Heterogeneity over Time
  4.5 Colocation vs. Face-to-Face Contact
5 Future Directions
Acknowledgments
References
Supplementary Material

1. Introduction
The worldwide use of mobile phones generates rich data describing human mobility, and epidemiologists have emphasized that this data can be vitally important for understanding the spread of infectious disease [1, 2, 3]. During the ongoing COVID-19 pandemic, such data has been used to parametrize compartmental models [4, 5] and to measure the causal impact of mobility-oriented interventions [6, 7, 8, 9, 10, 11]. However, this data is usually only available to researchers for specific parts of the world at specific times, through limited