Use of Twitter for Analysis of Public Sentiment for Improvement of Local Government Service

Yohei Seki
Faculty of Library, Information and Media Science
University of Tsukuba
Ibaraki, Japan 305–8550
Email: yohei@slis.tsukuba.ac.jp

Abstract: Active collaboration with the public is the key to improving the administrative services of local government, and social media is an essential tool for understanding public sentiment. In this paper, we propose a method to gauge public sentiment toward local government by using Twitter. We conducted an operational test with local government officers and found that our proposal was effective in revealing sentiments that were meaningful for the public, such as avoiding full parking lots, finding potholes in the road, or attracting favorable comments or suggestions to improve festival events.

1. Introduction

In recent years, civic technology has enabled public engagement and improvements in local government infrastructure and has thus attracted increasing attention. One prominent example is Code for America,¹ established by Jennifer Pahlka in 2009. This organization has worked to make government resources more accessible to the general public. Another important concept is Open311,² which proposes an open-source CRM, that is, a collaborative model for civic issue tracking in which nonemergency issues such as graffiti, potholes, and street cleaning are reported directly to local government organizations (cities).

Meanwhile, the information retrieval and data mining communities have focused on civic data mining studies, as in the ACM SIGIR tutorial Searching in the City of Knowledge [1], and in workshops such as Information Access in Smart Cities [2] and Mining Urban Data.³ One attractive potential application here is collecting the sentiments of residents in a local district from social media to reduce administrative costs.

In the sentiment analysis research field [3], many practical services and applications for information analysis and public opinion surveys have also been developed, using product or movie reviews, personal blogs, and microblogs. Sophisticated sentiment analysis techniques usually depend on a domain-specific lexicon, whereas the topics found in personal consumer-generated media (CGM) range over a variety of domains.

1. https://www.codeforamerica.org
2. http://www.open311.org/
3. http://www.insight-ict.eu/mud2

In this study, we propose a framework for collecting and analyzing public sentiment in a local district, to promote favorable spots and facilities in the district, or to reduce the administrative costs of the local government. For this purpose:

1) We collected the social media (Twitter) users who live in the local district. This approach can be applied to other geolocated social media services: Instagram, Foursquare, Panoramio, etc.
2) We flexibly extracted sentiment clues across the multiple domains relevant to, for example, a festival event or an administrative service.

For the first step, we collected Twitter users in the local district using their profiles, and extended this set using their followers. For the second step, we used Word2Vec [4] to flexibly extract sentiment clues, using domain-dependent term similarity based on co-occurrence with positive/negative terms.

The structure of this paper is as follows. In Section 2, we introduce related work. In Section 3, we describe our proposal. In Section 4, we discuss our experiments. We present our conclusions in Section 5.

2. Related Work

2.1. Social User Residence Estimation

To estimate social media users’ district of residence, previous work has proposed content-based approaches and graph-based approaches. As an example of the former, Cheng et al. [5] estimated users’ residence by extracting local terms whose usage was biased toward particular districts.
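
The idea of district-biased local terms can be illustrated with a minimal sketch. This is not the model of Cheng et al. [5]; it is a simplified stand-in that scores a term by the log ratio of its relative frequency in one district to its relative frequency over all districts. The function name `local_term_scores`, the `min_count` parameter, and the toy tweets are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def local_term_scores(tweets_by_district, min_count=1):
    """Score terms by how strongly their usage is biased toward a district.

    Simplified illustration (an assumption, not the method of [5]):
    score = log( P(term | district) / P(term | all districts) ).
    A score near 0 means the term is used equally everywhere.
    """
    # Count whitespace-separated, lowercased terms per district.
    district_counts = {
        district: Counter(
            term for tweet in tweets for term in tweet.lower().split()
        )
        for district, tweets in tweets_by_district.items()
    }

    # Aggregate counts over all districts.
    overall = Counter()
    for counts in district_counts.values():
        overall.update(counts)
    grand_total = sum(overall.values())

    scores = {}
    for district, counts in district_counts.items():
        district_total = sum(counts.values())
        scores[district] = {}
        for term, count in counts.items():
            if count < min_count:
                continue  # skip terms too rare to score reliably
            p_local = count / district_total
            p_overall = overall[term] / grand_total
            scores[district][term] = math.log(p_local / p_overall)
    return scores


# Hypothetical toy data: two districts with one shared, non-local tweet.
toy_tweets = {
    "tsukuba": ["matsuri parade tonight", "matsuri food stalls", "train delay again"],
    "mito": ["plum garden walk", "plum blossoms today", "train delay again"],
}

scores = local_term_scores(toy_tweets)
top_tsukuba = max(scores["tsukuba"], key=scores["tsukuba"].get)
```

In this toy data, "matsuri" appears only in the first district and receives a positive score, while "train" is used equally in both districts and scores zero, so it carries no evidence about residence.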
The drawback of this approach is that local terms are sometimes used by social media users who live in a different district and who refer to a variety of topics, particularly when the target district covers a small area. In this study, because our target district was at the local municipality level, we adopted the graph-based approach.

978-1-5090-0898-8/16/$31.00 ©2016 IEEE

Jurgens [6] proposed a graph-based approach to estimate Twitter users’ residence by focusing on their follow/follower relationships. He estimated the seed users’ residence based on their geo-tagged tweets. However, the