Uncovering the Spatio-Temporal Structure of Social Networks using Cell Phone Records

Luis G. Moyano*, Oscar R. Moll Thomae, Enrique Frias-Martinez*
*Telefonica Research Madrid, Spain; Email: moyano,efm@tid.es
MIT, Boston, MA, USA; Email: rimoll@mit.edu

Abstract—Although research in the areas of human mobility and social networks is extensive, our knowledge of the relationship between the mobility and the social network of an individual is very limited, mainly because data that capture both mobility and social interactions are difficult to access. In this paper we present and characterize some of the spatio-temporal features of social networks extracted from a large-scale dataset of cell phone records. Our goal is to measure to what extent individual mobility shapes the characteristics of a social network. Our results show a non-trivial dependence between social network structure and the spatial distribution of its elements. Additionally, we quantify in detail the probability that a contact is at a certain distance, and find that it may be described in the framework of gravity models, with different decay rates for urban and interurban scales.

Keywords-Human Mobility; Social Networks; CDR; Gravity Model

I. INTRO

Social networks have a fundamental role in our everyday lives. Our social connections determine to a great extent our daily activities, ranging from family to work, from leisure to travel. Understanding social networks, interpreted broadly as social ties and not only in terms of network applications, has become essential due to the profound implications that their structure can have for human activities [1]. The number, structure and type of social connections people have depend on many external factors, such as gender and socio-economic factors.
An important aspect of social networks is that they are geographically embedded [2], a fact that affects and is affected by the structure of the network. The exact mechanisms of this interplay between social structure and the geographical characteristics of its units have been addressed from different points of view [3], [4], [5], [6]. However, the lack of empirical data has been a limiting factor for the validation of any attempt to describe or explain this relationship.

In recent years, there has been a remarkable improvement in the deployment of pervasive infrastructures, such as mobile phones, GPS, online social networks, etc. In this context, and due to their wide coverage and high penetration, cell phones have become one of the main sensors of human behavior. As such, cell phone networks can capture both the social network of an individual (captured as cell phone calls between users) and the spatial characteristics of that individual (captured using the location of each cell phone antenna when phone calls are made). The high penetration of cell phones implies that they can capture a large number of spatio-temporal relationships at a scale not available to other pervasive infrastructures. This opens the door to characterizing how the structure of a social network is related to the mobility of the individuals who define those social interactions.

Cell phone call records are generated by telecommunication operators for billing purposes and may be gathered in datasets called Call Detail Records (CDRs). A considerable amount of research based on CDR analysis has mainly focused on human mobility [6], [7], [8], where the variable under study is the position of the user, with no information about the user's social contacts. However, to address the relationship between the spatial distribution of a user and her social structure, one should focus on understanding how this structure changes in space.
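As a rough illustration of how a single set of call records can yield both views described above, the following Python sketch derives a contact network (from who calls whom) and a per-user location trace (from the antenna used at call time). It is only a sketch under stated assumptions: the `CallRecord` fields, the `TOWERS` table and its coordinates, and the sample records are all hypothetical and do not reflect the schema or contents of the dataset analyzed in this paper.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class CallRecord(NamedTuple):
    # Hypothetical schema: field names are illustrative only.
    caller: str
    callee: str
    caller_tower: str
    callee_tower: str
    timestamp: int  # seconds since an arbitrary reference time


# Hypothetical tower coordinates as (latitude, longitude) in degrees.
TOWERS = {
    "T1": (40.4168, -3.7038),
    "T2": (40.4530, -3.6883),
    "T3": (41.3874, 2.1686),
}


def social_edges(records):
    """Count calls per unordered pair of users (the social network view)."""
    edges = Counter()
    for r in records:
        edges[frozenset((r.caller, r.callee))] += 1
    return edges


def user_locations(records, towers):
    """Collect (timestamp, lat, lon) sightings per user (the spatial view)."""
    seen = defaultdict(list)
    for r in records:
        seen[r.caller].append((r.timestamp, *towers[r.caller_tower]))
        seen[r.callee].append((r.timestamp, *towers[r.callee_tower]))
    return seen


# Made-up records, for illustration only.
records = [
    CallRecord("a", "b", "T1", "T2", 100),
    CallRecord("b", "a", "T3", "T1", 200),
    CallRecord("a", "c", "T2", "T3", 300),
]
edges = social_edges(records)
locations = user_locations(records, TOWERS)
```

Treating each pair of users as an unordered set is a simplifying choice made here so that calls in both directions count toward the same tie; a directed representation would be equally possible.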
To this end, a central question is to uncover the probability of having a contact located at a certain distance d. This question is well suited to be explored by analyzing cell phone datasets, where the distance d between users is captured at the moment an interaction takes place, d being defined by the two cell phone towers used to deliver the call. Cell phone records contain multiple calls between a given pair of users, and each of these calls may be associated with a potentially different distance. In other types of datasets, the distance at the time of the interaction is generally not available. Consequently, the most frequent approach to deal with this multiplicity of distances is to select a single quantity to represent the distance between two users, which may be the distance between homes [5], [9], the distance between zip codes [10], the distance between the most frequently used towers [11], or other equivalent measures of average position. As a result, these studies do not consider the full extent of spatio-temporal information but a coarse-grained description, as they assume that the distance between two individuals is constant, which is not the general case. To avoid this limitation, other studies have made use of location-based social networks, where