Social-Textual Search and Ranking ∗

Ali Khodaei
Department of Computer Science
University of Southern California
Los Angeles, CA 90089, USA
khodaei@usc.edu

Cyrus Shahabi
Department of Computer Science
University of Southern California
Los Angeles, CA 90089, USA
shahabi@usc.edu

ABSTRACT
Web search engines have traditionally focused on the textual content of data. The emergence of social networks and Web 2.0 applications makes it interesting to see how social data can be used to improve conventional textual search on the web. In this paper, we focus on how to improve the effectiveness of web search by utilizing social data available from users, users' actions and their underlying social network on the web. We define and formalize the problem of social-textual (socio-textual) search and show how the social aspect of the web can be effectively integrated into textual search engines. We propose a new social relevance ranking based on several parameters, including the relationships between users, the importance of each user and the actions users perform on web documents (objects). We show how the proposed social ranking can be combined with the conventional textual relevance ranking. We have conducted an extensive set of experiments on data from the online radio website last.fm to evaluate the effectiveness of our proposed approaches. Our experimental results are very promising and show a significant improvement for socio-textual ranking over textual-only and social-only approaches.

1. INTRODUCTION
Social networks on the web have grown significantly over the past few years. People have started to reconstruct their friendship networks in the virtual world, and many of these virtual relationships are good representatives of their actual (friendship) networks in the real world. At the same time, and with the emergence of Web 2.0, many web users have started to engage more with the web.
In contrast to the traditional web, where users are often in read-only mode, Web 2.0 has enabled users to be in read-write mode. In other words, users have started to express themselves by generating and publishing content (e.g., writing a tweet), re-sharing interesting content by others (e.g., re-tweeting) and rating/evaluating existing content (e.g., choosing a favorite tweet). The emergence of social networks and Web 2.0 has resulted in a huge amount of available data that can be utilized in many domains.

∗ This research is supported in part by the NSF grant IS-1115153, the USC Integrated Media Systems Center (IMSC), and also by unrestricted cash and equipment gifts from Google, Microsoft and Qualcomm. The opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation. CrowdSearch 2012 workshop at WWW 2012, Lyon, France. Copyright is held by the author(s).

In this paper, we focus on taking advantage of this information in the domain of (textual) web search. We argue that by integrating information from users' social networks and their activities on the web, we can improve conventional textual search and ranking. In today's web, we can know the existence and degree of relationships among people and, at the same time, have knowledge of people's interests derived from their actions/activities on the web.
It is both intuitive and proven [3] that people have very similar interests to those of their friends. Also, people tend to trust the opinions and judgements of their friends more than those of strangers. We show how to modify the existing (textual) relevance rankings to take the user's social network into consideration when generating ranked results for search queries.

Consider the following example. A user searches for "funny video clip". Using conventional textual search, the user receives a ranked list of funny video clips. Using the user's social network, on the other hand, videos that contain the query keywords (i.e., funny video clip) and have more comments, likes or favorites by the user's friends should be ranked higher. However, computing the new ranking is not trivial. Do we give more weight to the textual keywords or to the social network? Within the social aspect of the ranking, do we need to assign different weights to different friends of the user? How about the popularity of the users (friends) in general? Also, which actions are important for objects, and how do we quantify them?

In order to combine social data into textual relevance ranking, the social relevance between users and objects (documents) has to be defined first. In order to model social relevance, the existence and degree of relationships between users have to be taken into consideration. Also, the actions permitted for each type of document (object) and their importance should be modeled. Finally, the overall importance/impact of each user has to be considered as well.

We first review the few existing studies regarding social search and the utilization of social networks in web search. Then, we define and formalize our problem. Next, we present new scoring methods to calculate the social relevance between users and documents (objects). We show how the impor-