RESEARCH ARTICLE

How to Estimate Epidemic Risk from Incomplete Contact Diaries Data?

Rossana Mastrandrea 1,2, Alain Barrat 1,3 *

1 Aix Marseille Univ, Univ Toulon, CNRS, CPT, Marseille, France
2 IMT Institute of Advanced Studies, Lucca, Lucca, Italy
3 Data Science Laboratory, ISI Foundation, Torino, Italy

* alain.barrat@cpt.univ-mrs.fr

Abstract

Social interactions shape the patterns of spreading processes in a population. Techniques such as diaries or proximity sensors make it possible to collect data about encounters and to build networks of contacts between individuals. The contact networks obtained from these different techniques are, however, quantitatively different. Here, we first show how these discrepancies affect the prediction of the epidemic risk when these data are fed to numerical models of epidemic spread: the low participation rate, under-reporting of contacts, and overestimation of contact durations in contact diaries with respect to sensor data lead to important differences in the outcomes of the corresponding simulations, with for instance an enhanced sensitivity to initial conditions. Most importantly, we investigate if and how information gathered from contact diaries can be used in such simulations to yield an accurate description of the epidemic risk, assuming that data from sensors represent the ground truth. The contact networks built from contact sensors and diaries indeed present several structural similarities: this suggests that it is possible to construct, using only the contact diary network information, a surrogate contact network such that simulations using this surrogate network give the same estimation of the epidemic risk as simulations using the contact sensor network.
We present and compare several methods to build such surrogate data, and show that it is indeed possible to obtain a good agreement between the outcomes of simulations using surrogate and sensor data, as long as the contact diary information is complemented by publicly available data describing the heterogeneity of the durations of human contacts.

Author Summary

Schools, offices, and hospitals play an important role in the spreading of epidemics. Information about interactions between individuals in such contexts can help understand the patterns of transmission and design ad hoc immunization strategies. Data about contacts can be collected through various techniques such as diaries or proximity sensors. Here, we first ask if the corresponding datasets give similar predictions of the epidemic risk when they are used to build a network of contacts among individuals. Not surprisingly, the answer is negative: indeed, if we consider data from sensors as the ground truth, diaries are affected by a low participation rate, underreporting, and overestimation of durations. Is it however

PLOS Computational Biology | DOI:10.1371/journal.pcbi.1005002 June 24, 2016

OPEN ACCESS

Citation: Mastrandrea R, Barrat A (2016) How to Estimate Epidemic Risk from Incomplete Contact Diaries Data? PLoS Comput Biol 12(6): e1005002. doi:10.1371/journal.pcbi.1005002

Editor: Marcel Salathé, Ecole Polytechnique Federale de Lausanne, SWITZERLAND

Received: January 15, 2016
Accepted: May 25, 2016
Published: June 24, 2016

Copyright: © 2016 Mastrandrea, Barrat. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: Data can be downloaded from the dedicated webpage http://www.sociopatterns.org/datasets/.
Funding: This work was supported by the A*MIDEX project (ANR-11-IDEX-0001-02) funded by the "Investissements d'Avenir" French Government program, managed by the French National Research Agency (ANR), to AB and RM. AB is also partially supported by the French ANR project HarMS-flu (ANR-12-MONU-0018) and by the EU FET project Multiplex 317532. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.