World Wide Web
DOI 10.1007/s11280-012-0165-5
SocialSearch+: enriching social network with web evidences
Gae-won You · Jin-woo Park · Seung-won Hwang ·
Zaiqing Nie · Ji-Rong Wen
Received: 15 July 2011 / Revised: 29 March 2012 /
Accepted: 9 April 2012
© Springer Science+Business Media, LLC 2012
Abstract This paper introduces the problem of searching for social network
accounts, e.g., Twitter accounts, using the rich information available on the Web,
e.g., people's names, attributes, and relationships to other people. For this
purpose, we need to map Twitter accounts to Web entities. However, existing
solutions that build upon naive textual matching inevitably suffer from low
precision due to false positives (e.g., fake impersonator accounts) and false
negatives (e.g., accounts using nicknames). To overcome these limitations, we
leverage “relational” evidence extracted from the Web corpus. We consider two types
of evidence resources. The first is web-scale entity relationship graphs, extracted
from name co-occurrences crawled from the Web; this co-occurrence relationship can
be interpreted as an “implicit” counterpart of Twitter follower relationships. The
second is web-scale relational repositories, such as Freebase, which have
complementary strengths. Using both textual and relational features obtained from
these resources, we learn a ranking function that aggregates these features to
accurately order candidate matches. Another key contribution of this
paper is to formulate confidence scoring as a separate problem from relevance
This work builds on and significantly extends our preliminary work [23].
G.-w. You · J.-w. Park · S.-w. Hwang (✉)
Pohang University of Science and Technology, Pohang, Republic of Korea
e-mail: swhwang@postech.ac.kr
G.-w. You
e-mail: gwyou@postech.ac.kr
J.-w. Park
e-mail: jwpark85@postech.ac.kr
Z. Nie · J.-R. Wen
Microsoft Research Asia, Beijing, People’s Republic of China
Z. Nie
e-mail: znie@microsoft.com
J.-R. Wen
e-mail: jrwen@microsoft.com