TimeGeo: modeling urban mobility without travel surveys

Shan Jiang a,1, Yingxiang Yang a,1, Siddharth Gupta a, Daniele Veneziano a, Shounak Athavale b, and Marta C. González a,c,2

a Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139; b Ford Motor Company, Dearborn, MI 48126; c Center for Advanced Urbanism, Massachusetts Institute of Technology, Cambridge, MA 02139

This manuscript was compiled on April 6, 2016

Well-established fine-scale urban mobility models today depend on detailed but cumbersome and expensive travel surveys for their calibration. Not much is known, however, about the set of mechanisms needed to generate complete mobility profiles using only passive datasets with mostly sparse traces of individuals. In this study, we present a novel mechanistic modeling framework (TimeGeo) that effectively generates urban mobility patterns with a resolution of ten minutes and hundreds of meters. It ties together the inference of home and work activity locations from data with the modeling of flexible activities (e.g., other) in space and time. The temporal choices are captured by only three features: the weekly home-based tour number, the dwell rate, and the burst rate. Combined, these generate for each individual: (i) the stay duration of activities, (ii) the number of visited locations per day, and (iii) daily mobility networks. These parameters capture how an individual deviates from the circadian rhythm of the population, and generate the wide spectrum of empirically observed mobility behaviors. The spatial choices of visited locations are modeled by a rank-based exploration and preferential return (r-EPR) mechanism that incorporates space into the EPR model. Finally, we show that a hierarchical multiplicative cascade method can measure the interaction between land use and the generation of trips. In this way, urban structure is directly related to the observed travel distances.
This novel framework allows us to fully embrace the massive amount of individual data generated by information and communication technologies (ICTs) worldwide to comprehensively model urban mobility without travel surveys.

human mobility | urban model | mobile phone data

Our ability to correctly model urban daily activities for traffic control, energy consumption, and urban planning [1, 2] has a critical impact on people’s quality of life and the everyday functioning of our cities. To inform policy making for important projects such as planning a new metro line or managing traffic demand during big events, or to prepare for emergencies, we need reliable models of urban travel demand. These are high-resolution models that simulate individual mobility for an entire region [3, 4]. Traditionally, inputs for such models are based on census and household travel surveys. These surveys collect information about individuals (socioeconomic, demographic, etc.), their households (size, structure, relationships), and their journeys on a given day. Nonetheless, the high cost of conducting the surveys severely limits their sample sizes and frequencies. In most cases, they capture only 1% of the urban household population once a decade, with information on only one or a few days per individual. The low sampling rate has made it very costly to infer the choices of the entire urban population [3, 5–7].

More recent studies try to learn about human behavior in cities by using data collected from location-aware technologies, instead of manual surveys, to infer the preferences in travel decisions that are needed to calibrate existing choice modeling frameworks [8–10]. The problem, however, is that the geotagged data available from communication technologies, in their massive and low-cost form, cannot inform us about the detailed activity choices of their users, making most of the data useless for meaningful urban-scale mobility models.
In order to make the best use of these massive and passive data, a fundamental paradigm shift is needed to model urban mobility and to seize the new opportunities emerging through urban computing [11]. This is our goal with TimeGeo, a modeling framework that extracts the individual features and key mechanisms needed to effectively generate complete urban mobility profiles from the sparse and incomplete information available in telecommunication activities.

Mobile phones are the prevalent communication tools of the twenty-first century, with worldwide coverage of up to 96% of the population [12]. Call detail records (CDRs), managed by mobile phone service providers for billing purposes, contain information in the form of geo-located traces of users across the globe. Mobile phone data have so far been useful for improving our knowledge of human mobility at an unprecedented scale, informing us about the frequency and the number of visited locations over long-term observations [13–18], the daily mobility networks of individuals [15, 19], and the distribution of trip distances [13, 15, 17, 20–22]. Due to the sparse nature of mobile phone usage, these data sources have sampling biases and do not provide complete journeys in space and time for each individual [9]. Nonetheless, it has been possible to extract and characterize from phone data where each ...

Significance Statement

Individual mobility models are important in a wide range of application areas. Current mainstream urban mobility models require socio-demographic information from costly manual surveys, which have small sample sizes and are updated infrequently. In this study, we propose a novel individual mobility modeling framework, TimeGeo, that extracts the required features from ubiquitous, passive, and sparse digital traces in the ICT era.
The model is able to generate individual trajectories at high spatiotemporal resolution, with interpretable mechanisms and parameters capturing heterogeneous individual travel choices. The modeling framework can flexibly adapt to input data with different resolutions, and can be further extended for various modeling purposes.

S.J., Y.Y., and M.C.G. designed research; S.J., Y.Y., S.G., D.V., S.A., and M.C.G. performed research; S.J., Y.Y., and S.G. analyzed data; S.J., Y.Y., D.V., and M.C.G. wrote the paper.

1 S.J. and Y.Y. contributed equally to this work.
2 To whom correspondence should be addressed. E-mail: martag@mit.edu

www.pnas.org/cgi/doi/10.1073/pnas.XXXXXXXXXX PNAS | April 6, 2016 | vol. XXX | no. XX | 1–12