Unsupervised Discovery of Opposing Opinion Networks From Forum Discussions

Yue Lu
Twitter Inc.
1355 Market St. Suite 900
San Francisco, CA 94103
yuelu@twitter.com

Hongning Wang, ChengXiang Zhai, Dan Roth
Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, IL 61820
{wang296, czhai, danr}@illinois.edu

ABSTRACT
With more and more people freely expressing opinions and actively interacting with each other in discussion threads, online forums are becoming a gold mine of information about people’s opinions and social behaviors. In this paper, we study an interesting new problem: automatically discovering opposing opinion networks of users from forum discussions, that is, subsets of users who are strongly against each other on some topic. Toward this goal, we propose to use signals from both textual content (e.g., who says what) and social interactions (e.g., who talks to whom), both of which are abundant in online forums. We also design an optimization formulation that combines all the signals in an unsupervised way. We created a data set by manually annotating forum data on five controversial topics, and our experimental results show that the proposed optimization method outperforms several baselines and existing approaches, demonstrating the power of combining text analysis and social network analysis in analyzing and generating opposing opinion networks.

Categories and Subject Descriptors
H.3.m [Information Storage and Retrieval]: Miscellaneous; I.2.6 [Artificial Intelligence]: Learning

General Terms
Algorithms, Experimentation

Keywords
opinion analysis, social network analysis, optimization, online forums, linear programming

1. INTRODUCTION
Online forums are among the early applications for managing and promoting user-generated content.
Although simple in design (users carry out discussions in the form of message threads), forums remain prevalent and popular even amid the recent rise of many sophisticated Web 2.0 applications. As users actively express their opinions and exchange knowledge on all kinds of topics and issues, e.g., technology, sports, religion, and politics, forums are becoming a great source for opinion mining. However, the simple design of forums, combined with rapidly accumulating data, makes it challenging to make sense of forum discussions.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
CIKM’12, October 29–November 2, 2012, Maui, HI, USA.
Copyright 2012 ACM 978-1-4503-1156-4/12/10 ...$10.00.

Figure 1: Example Opposing Opinion Network for the Thread on “Abortion”. [The figure shows a Supporting Group and an Against Group of users, with example posts: “Woman who gets an abortion should get life in prison.”; “Abortions will never be illegal”; “No criminal punishment for a woman who gets an abortion.”; “It’s a form of population control.”]

In this paper, we study an interesting new problem: automatically discovering opposing opinion networks from forum discussions. We define these as latent user groups with strongly opposing opinions on different topics, i.e., a supporting group and an against group. Figure 1 illustrates opposing user groups on the topic “Abortion”. Such discovered opinion networks can serve as a concise and interesting summary of the topics and users in forum discussions.
They can also provide a sense of “virtual community” for online users, helping them engage in forum activities more easily. Once identified, the latent opposing opinion networks enable a number of interesting applications that add social components to forums. For example, we can detect semantically similar topics that involve similar groups of opposing users. We can also find like-minded users who often agree with each other across different topics, or “enemy” users who are often against each other across different topics.

Discovering opposing opinion networks is related to existing work on opinion mining, which we review in detail in Section 6. In short, our work differs from existing work in that we exploit the unique characteristics of forum data in an unsupervised way, combining signals from both textual content (e.g., who says what) and social interactions (e.g., who talks to whom). More specifically, from the textual content analysis perspective, we propose two kinds of analysis: (1) topic model analysis of aspect mentions in post text and (2) bootstrapping-based classification of agree and disagree relations between posts. From the social network analysis perspective, we make two assumptions: (1) user consistency across different posts in the same thread and (2) user-user relation consistency in the same thread. Finally, to consolidate all the signals in a unified way, we design an