Linking cyber and physical spaces through community detection and clustering in social media feeds

Arie Croitoru a,*, N. Wayant b, A. Crooks c, J. Radzikowski a, A. Stefanidis a

a The Center for Geospatial Intelligence, Dept. of Geography and Geoinformation Science, George Mason University, 4400 University Drive, MS 6C3, Fairfax, VA 22030, United States
b US Army Geospatial Research Laboratory, 7701 Telegraph Road, Alexandria, VA 22315-3802, United States
c Dept. of Computational Social Science, George Mason University, 4400 University Drive, MS 6C3, Fairfax, VA 22030, United States

Article info

Article history: Available online xxxx

Keywords: Social media; Spatiotemporal clustering; Social network analysis; Community detection; Geospatial analysis

Abstract

Over the last decade we have witnessed a significant growth in the use of social media. Interactions within their context lead to the establishment of groups that function at the intersection of the physical and cyber spaces, and as such represent hybrid communities. Gaining a better understanding of how information flows in these hybrid communities is a substantial scientific challenge with significant implications on our ability to better harness crowd-contributed content. This paper addresses this challenge by studying how information propagates and evolves over time at the intersection of the physical and cyber spaces. By analyzing the spatial footprint, social network structure, and content in both physical and cyber spaces we advance our understanding of the information propagation mechanisms in social media. The utility of this approach is demonstrated in two real-world case studies, the first reflecting a planned event (the Occupy Wall Street – OWS – movement’s Day of Action in November 2011), and the second reflecting an unexpected disaster (the Boston Marathon bombing in April 2013).
Our findings highlight the intricate nature of the propagation and evolution of information both within and across cyber and physical spaces, as well as the role of hybrid networks in the exchange of information between these spaces.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

The past few years have witnessed a dramatic increase in the adoption and use of social media (Kaplan & Haenlein, 2010). In the U.S. alone, approximately two-thirds of online users participate in social media (Smith, 2011), spending on average between 3.6 and 6.5 hours a month on social networking sites such as Facebook or Twitter (Nielsen, 2012). This has led to an unprecedented increase in the volume of data generated by social media users: every minute, over 270,000 tweets (or retweets) are contributed worldwide (Forbes, 2012), 3000 images are posted on Flickr (Sapiro, 2011), and 100 hours of video are uploaded to YouTube (YouTube, 2014). These are but a few examples of the shift that has occurred in recent years toward user-generated digital content. With millions of users around the world, this trend is likely to further intensify (Hollis, 2011) as technological advances empower users to contribute richer data at higher rates.

Social media services and platforms offer a wide array of digital channels for expression and interaction, ranging from forums/message boards (e.g. MacRumors), weblogs (e.g. Blogger, Wordpress), and microblogging (e.g. Twitter, Tumblr, Weibo), to wikis (e.g. Wikipedia, Wikimapia), social networking services (e.g. Facebook, Google+, LinkedIn), and podcasts (video and audio, e.g. iTunes, Ustream). Such media have enabled the general public to contribute, disseminate, and exchange information (Kaplan & Haenlein, 2010), by introducing a bottom-up alternative to complement the traditional top-down nature of Web 1.0 (Schneckenberg, 2009).
This has not only resulted in a change in traditional journalism and news reporting (Deuze, 2008; Kwak, Lee, Park, & Moon, 2010), but it is also leading to new opportunities within the geographical sciences (Caverlee, Cheng, Sui, & Kamath, 2013; Sui & Goodchild, 2011) due to the rich geographic context that social media data often provide. A noteworthy example of this trend is the Livehoods project (Cranshaw, Schwartz, Hong, & Sadeh, 2012), which uses social media to characterize and understand urban dynamics. Indeed, social media, and micro-blogging in particular, have already been shown to be useful in predicting pandemics (Chunara, Andrews, & Brownstein, 2012; Culotta, 2010; Ritterman, Osborne, & Klein, 2009) or natural disasters

* Corresponding author.
E-mail addresses: acroitor@gmu.edu (A. Croitoru), Nicole.M.Wayant@usace.army.mil (N. Wayant), acrooks2@gmu.edu (A. Crooks), jradziko@gmu.edu (J. Radzikowski), astefani@gmu.edu (A. Stefanidis).

Computers, Environment and Urban Systems xxx (2014) xxx–xxx
journal homepage: www.elsevier.com/locate/compenvurbsys
http://dx.doi.org/10.1016/j.compenvurbsys.2014.11.002
0198-9715/© 2014 Elsevier Ltd. All rights reserved.