Twitter as an indicator for whereabouts of people? Correlating Twitter
with UK census data
Enrico Steiger ⁎, René Westerholt, Bernd Resch, Alexander Zipf
GIScience Research Group, Institute of Geography, Heidelberg University, Germany
Article info
Article history:
Received 2 October 2014
Received in revised form 8 September 2015
Accepted 17 September 2015
Available online xxxx
Keywords:
Crowdsourcing of human activities
LBSN
Twitter
Spatial autocorrelation
Semantic topic modeling
Abstract
The whereabouts of people and their social activities in urban areas are
still largely unexplored at high spatial and temporal resolution. The
spatiotemporal analysis of Location Based Social Networks (LBSN) therefore
has great potential for sensing spatial processes and gaining knowledge about
urban dynamics, especially with respect to collective human mobility
behavior. The objective of this paper is to explore the semantic association
between georeferenced tweets and their respective spatiotemporal whereabouts.
We apply a semantic topic model classification and spatial autocorrelation
analysis to detect tweets indicating specific human social activities. We
then correlate the observed tweet patterns with official census data for the
case study of London in order to underline the significance and reliability
of Twitter data. Our empirical results show an overall strong positive
correlation between semantically and spatiotemporally clustered tweets and
workplace population census data, suggesting that such tweets are a good
indicator and representative proxy for analyzing workplace-based activities.
© 2015 Elsevier Ltd. All rights reserved.
1. Introduction
Cities are complex multifunctional systems that serve as major hubs for
a number of human social activities. With more than half of the world's
population living in urban areas and continuing urban growth (United
Nations Population Fund, 2008), the capability to provide viable service
infrastructure (roads, public transport, energy supplies, etc.) for the
majority of people is a rising challenge. The characterization of urban
structures can facilitate urban and transportation planning processes by
providing valuable information that helps to predict the increased
pressure on existing urban infrastructures. Regular commuting between
workplaces and places of residence, and activities originating from these
areas, are major examples of daily routines within urban areas,
influencing human mobility and affecting transportation planning. In the UK
in 2013, a person on average made 145 trips with 19% of all trip purposes
related to business and commuting activities (Department for
Transport, 2014).
Determining the frequency and spatial distribution of travel origins
and destinations for every trip purpose is a principal area of quantitative
study, currently carried out by means of mobility surveys (Morris,
Humphrey, & Tipping, 2014). However, such surveys are expensive in terms of
the required labor input and usually lead to limited sample sizes.
Thus, the investigation of typically larger spatiotemporal human activity
clusters obtained from crowdsourced information may help to
understand commuting patterns and reveal specific urban structures
such as workplace concentrations.
In this context, emerging, inexpensive and widespread sensor
technologies have created new possibilities to infer mobility data for
exploring urban structures and dynamics. The growing availability of
GPS-equipped mobile devices with broadband internet access allows users
to actively participate and create content through mobile
applications and location-based services (ITU, 2014).
Georeferenced Twitter data in particular offers a promising opportunity
to understand geographic processes inside online social networks. The
enormous potential of interactive social media platforms like Twitter
has been increasingly recognized by numerous research domains in
recent years. Although a growing body of research uses Twitter data to
analyze urban processes, the empirical validation of human social
activities that reveal urban structures and human mobility patterns from
crowdsourced information is still widely unexplored
(Resch, Beinat, Zipf, & Boher, 2012).
In a previous study, we introduced a semantic and spatial analysis
method through which we were able to extract human mobility flows
from uncertain Twitter data (Steiger, Ellersiek, & Zipf, 2014). However,
it remains to be investigated whether we can find similar semantic
layers that represent collective human behavior in co-occurrence with
underlying social activity.
Therefore, research question 1 (RQ1) investigates the possibility of
exploring urban structures by characterizing spatiotemporal and
semantic patterns of human social activities. To this end, we extract topics
covering work-related and home-related activities that reflect typical
collective human behavior (e.g., city-scale human mobility). Thus, the
Computers, Environment and Urban Systems 54 (2015) 255–265
⁎ Corresponding author at: Institute of Geography, Heidelberg University, Berliner
Straße 48, D-69120 Heidelberg, Germany.
E-mail address: enrico.steiger@geog.uni-heidelberg.de (E. Steiger).
http://dx.doi.org/10.1016/j.compenvurbsys.2015.09.007
0198-9715/© 2015 Elsevier Ltd. All rights reserved.