In or Out?
Real-Time Monitoring of BREXIT sentiment on Twitter
Laurentiu Vasiliu
Peracton Ltd.
Dublin
Ireland
laurentiu.vasiliu@peracton.com
André Freitas, Frederico Caroli, Siegfried Handschuh
University of Passau
Germany
{first.last}@uni-passau.de
Ross McDermott, Manel Zarrouk, Manuela Hürlimann, Brian Davis,
Tobias Daudert, Malek Ben Khaled, David Byrne
Insight Centre for Data Analytics
National University of Ireland
Galway, Ireland
{first.last}@insight-centre.org
Sergio Fernández
Redlink GmbH
Salzburg
Austria
sergio.fernandez@redlink.co
Angelo Cavallini
3rdPlace S.r.l.
Milan
Italy
angelo.cavallini@3rdplace.com
ABSTRACT
The SSIX (Social Sentiment analysis financial IndeXes) project is
a European Innovation Project sponsored by the European
Commission under the Horizon 2020 framework. SSIX aims to
provide European SMEs with a collection of easy-to-interpret
tools to analyse and understand social media sentiment for any
given topic regardless of locale or language. The United
Kingdom’s recent referendum on European Union membership,
i.e. staying in (“Bremain”) or leaving (“Brexit”) the EU, was selected
as the initial real-world test case for validating the SSIX
methodology and platform. In this paper, we briefly describe the SSIX
architecture, analyse the platform’s X-Scores metrics and their
application to Brexit, and present our initial experimental
results and lessons learned.
CCS Concepts
• Computing methodologies➝ Artificial intelligence➝ Natural
language processing➝ Information extraction.
• Computing methodologies➝ Machine learning➝ Learning
paradigms➝ Supervised learning by classification.
Keywords
SSIX; Brexit; Natural Language Processing; Machine Learning;
Opinion Mining; Twitter; Sentiment Analysis; Political Opinion
Mining.
1. INTRODUCTION
The SSIX (Social Sentiment analysis financial IndeXes) project¹
is a European Innovation Project sponsored by the European
Commission under the Horizon 2020 framework. SSIX aims to
provide European SMEs with a collection of easy-to-interpret
tools to analyse and understand social media users’ opinions on any
given topic regardless of locale or language. The SSIX platform
interprets significant sentiment signals in social media
conversations producing sentiment metrics, such as sentiment
dynamics, sentiment volatility and sentiment momentum.
¹ http://ssix-project.eu/
The recent United Kingdom European Union membership
referendum on staying in (“Bremain”) or leaving (“Brexit”) the EU
was chosen as a first real-world test case for the SSIX consortium
[1]. The goal was to stress test the SSIX platform and the
methodology we employ to infer
opinion/sentiment from social networks. Furthermore, we
analysed a set of rolling metrics called X-Scores,
such as the raw aggregated sentiment, volumes, rolling averages
and non-standard technical oscillators such as the relative strength
index (RSI), to examine their value for providing insights into
sentiment behaviour. These initial tests enabled us to examine
the SSIX platform in a real-world scenario for the first time and
provided extremely valuable feedback about both the behaviour of
the technology we employed and our fundamental
assumptions on extracting sentiment data from social networks,
which is intended for various use cases, primarily decision-making.
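To make the rolling metrics mentioned above more concrete, the following is a minimal sketch of a rolling average and an RSI-style oscillator applied to a series of aggregated sentiment values. It is not the SSIX implementation: the function names, the window lengths, the use of simple (unsmoothed) averages of gains and losses, and the convention of returning 50 for a flat series are all assumptions made for illustration.

```python
from typing import List, Optional


def rolling_mean(values: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until a full window is available."""
    out: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


def rsi(values: List[float], period: int = 14) -> List[Optional[float]]:
    """RSI-style oscillator in [0, 100] over the last `period` changes.

    Assumption: gains and losses are averaged with a plain mean rather
    than an exponentially smoothed one.
    """
    out: List[Optional[float]] = [None] * len(values)
    # changes[j] is the move from values[j] to values[j + 1]
    changes = [b - a for a, b in zip(values, values[1:])]
    for i in range(period, len(values)):
        window = changes[i - period : i]
        avg_gain = sum(c for c in window if c > 0) / period
        avg_loss = sum(-c for c in window if c < 0) / period
        if avg_gain == 0 and avg_loss == 0:
            out[i] = 50.0  # assumed neutral value for a flat series
        elif avg_loss == 0:
            out[i] = 100.0  # only gains in the window
        else:
            rs = avg_gain / avg_loss
            out[i] = 100.0 - 100.0 / (1.0 + rs)
    return out
```

In this sketch, values near 100 indicate that sentiment has been rising almost continuously over the window, values near 0 that it has been falling, and values around 50 that gains and losses are balanced.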
2. ASSUMPTIONS AND SSIX ARCHITECTURE
As originally foreseen, the SSIX project aims to cover the most
important social networks, such as Facebook, Twitter and
LinkedIn. For the Brexit exercise, we started with Twitter only,
for reasons of technical accessibility. We note that Twitter users
do not coincide exactly with the UK voting population
but overlap with only a portion of it [2]. Moreover, it was not easy to identify
what constitutes ‘overlap’, since many users do not publicly
disclose the location they tweet from or their place of residence.
However, we attempt to mitigate this by capturing English-language
messages only. Overall, 40% of all activity can be said to come
from geographical Europe (this includes time zones such as GMT,
which cannot be attributed to a single country), while 18% comes
from outside Europe. For the remaining 42%, it was not possible to determine
the location because the time zone is not set. Next, we present
the locations of Twitter users and the percentage of sentiment
expressed from those locations for some European² countries. This
data represents only 33% (2.3 million Tweets out of a total of 5.9
million) of the entire data collection. Note that not all users enable
their location data, so it was not possible to capture this
information fully.
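The time-zone-based attribution described above can be sketched as follows. This is an illustrative reconstruction, not the code used in the project: the set of time-zone labels treated as European is hypothetical and deliberately incomplete, and the assumption is that each message carries an optional time-zone string taken from the user's profile.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

# Hypothetical, incomplete set of time-zone labels counted as European.
# "GMT" is included although it cannot be tied to a single country.
EUROPEAN_ZONES = {
    "London", "Dublin", "Edinburgh", "Berlin", "Paris",
    "Rome", "Madrid", "Amsterdam", "GMT",
}


def region_of(time_zone: Optional[str]) -> str:
    """Map a profile time zone to a coarse region label."""
    if not time_zone:
        return "unknown"  # time zone not set in the profile
    return "europe" if time_zone in EUROPEAN_ZONES else "outside_europe"


def region_shares(time_zones: Iterable[Optional[str]]) -> Dict[str, float]:
    """Percentage of messages per region label."""
    counts = Counter(region_of(tz) for tz in time_zones)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {region: 100.0 * n / total for region, n in counts.items()}
```

For example, `region_shares(["London", "GMT", None, "", "Tokyo"])` assigns two messages to Europe, two to the unknown bucket and one to outside Europe. A lookup of this kind can only be as precise as the time-zone labels allow, which is why a shared zone such as GMT is counted at the level of geographical Europe rather than a single country.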
² Here, European is meant in the geographical sense, covering both EU and non-EU countries.
© 2016 Copyright held by the author/owner(s).
SEMANTICS 2016: Posters and Demos Track
September 13-14, 2016, Leipzig, Germany