Analyzing the Quality of Information Solicited from Targeted Strangers on Social Media

Jeffrey Nichols*, Michelle X. Zhou*, Huahai Yang*, Jeon Hyung Kang, Xiaohua Sun

*IBM Research – Almaden, 650 Harry Road, San Jose, CA 95120
{jwnichols,mzhou,hyang}@us.ibm.com
USC ISI, 4676 Admiralty Way, Marina del Rey, CA 90292
jeonhyuk@usc.edu
Tongji University, Shanghai, China
xsun@tongji.edu.cn

ABSTRACT
The emergence of social media creates a unique opportunity for developing a new class of crowd-powered information collection systems. Such systems actively identify potential users based on their public social media posts and solicit them directly for information. While studies have shown that users will respond to solicitations in a few domains, there is little analysis of the quality of information received. Here we explore the quality of information solicited from Twitter users in the domain of product reviews, specifically reviews for a popular tablet computer and L.A.-based food trucks. Our results show that the majority of responses to our questions (>70%) contained relevant information and often provided additional details (>37%) beyond the topic of the question. We compared the solicited Twitter reviews to other user-generated reviews from Amazon and Yelp, and found that the Twitter answers provided similar information when controlling for the questions asked. Our results also reveal limitations of this new information collection method, including its suitability in certain domains and potential technical barriers to its implementation. Our work provides strong evidence for the potential of this new class of information collection systems and design implications for their future use.

Author Keywords
Social Q&A; crowdsourcing; Twitter; product reviews

ACM Classification Keywords
H.5.2

General Terms
Algorithms; Experimentation; Human Factors

INTRODUCTION
Hundreds of millions of people express themselves every day on public social media, such as Twitter.
This creates a unique opportunity for building a brand new class of crowd-powered information collection systems, which actively solicit information from the right people at the right time based on their public social media posts. For example, if a person just tweeted about getting a sandwich from a food truck, such a system can ask her to provide additional details about her experience. This approach offers several advantages over other crowd-powered information collection systems, such as social Q&A [3]. First, it can collect information about an event, such as a robbery, soon after that event occurred. Second, it can collect information from people who are most likely to share their “visceral reaction” to an event [4]. Third, it can collect information from a range of people across a particular dimension of a population (e.g., liberal vs. conservative).

Although this approach and its feasibility have been demonstrated in a few domains [1, 12], many unknowns about the approach remain to be explored. One unknown is the level of information quality obtainable through this collection method. While there is abundant research effort on studying the quality of crowd-sourced information, especially in the form of social Q&A systems [5, 8, 9, 10, 16, 17], this new approach differs in ways that may influence the outcomes:

• Answers are actively solicited from strangers, who have not opted in a priori and most likely have no social or organizational ties to the question asker. This may cause strangers to be less likely to respond, but it may also cause the responses that are received to be more objective and balanced, because the information providers are not self-selected and may have fewer intrinsic motivations for providing information [4].

• Information exchange occurs mainly between the asker and answerer, without any moderation by a larger group.
This removes the potential reputation and filtering benefits that typical social Q&A sites, like Quora, use to govern the quality of crowd-sourced information.

• Potential information providers are chosen based on their social media content, which may be misleading about their true ability to provide quality information.

To rigorously assess the quality of information collected using this approach, we have designed and conducted a set of experiments that focus on two aspects. First, we focus on analyzing the quality of crowd-sourced information collect-

CSCW ’13, February 23–27, 2013, San Antonio, Texas, USA.
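The core solicitation loop described in the introduction (watch public posts for a topic of interest, then reply to the author with a targeted question) can be sketched roughly as follows. This is only an illustrative assumption of how such a system might be structured; the `Post` type, the trigger phrases, and the `solicit` helper are hypothetical and are not part of the system studied in the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    """A minimal stand-in for a public social media post."""
    author: str
    text: str


# Hypothetical mapping from a trigger phrase to the question we
# would ask the post's author (phrases chosen to echo the paper's
# food-truck and tablet review domains).
TRIGGERS = {
    "food truck": "How was the food? Would you recommend the truck?",
    "tablet": "How do you like your new tablet so far?",
}


def solicit(post: Post) -> Optional[str]:
    """Return a reply question if the post matches a trigger topic,
    or None if the author should not be contacted."""
    text = post.text.lower()
    for phrase, question in TRIGGERS.items():
        if phrase in text:
            # Address the reply directly to the post's author.
            return f"@{post.author} {question}"
    return None


# Example: a post mentioning a food truck triggers a solicitation.
reply = solicit(Post("alice", "Just grabbed a sandwich from a food truck!"))
```

In a real deployment the substring match above would be replaced by the system's actual targeting logic, and the reply would be posted through the platform's API subject to its rate limits and anti-spam rules.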