576 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 16, NO. 3, JUNE 2008

SybilGuard: Defending Against Sybil Attacks via Social Networks

Haifeng Yu, Michael Kaminsky, Phillip B. Gibbons, Member, IEEE, and Abraham D. Flaxman

Abstract—Peer-to-peer and other decentralized, distributed systems are known to be particularly vulnerable to sybil attacks. In a sybil attack, a malicious user obtains multiple fake identities and pretends to be multiple, distinct nodes in the system. By controlling a large fraction of the nodes in the system, the malicious user is able to “out vote” the honest users in collaborative tasks such as Byzantine failure defenses. This paper presents SybilGuard, a novel protocol for limiting the corruptive influences of sybil attacks. Our protocol is based on the “social network” among user identities, where an edge between two identities indicates a human-established trust relationship. Malicious users can create many identities but few trust relationships. Thus, there is a disproportionately small “cut” in the graph between the sybil nodes and the honest nodes. SybilGuard exploits this property to bound the number of identities a malicious user can create. We show the effectiveness of SybilGuard both analytically and experimentally.

Index Terms—Social networks, sybil attack, SybilGuard, sybil identity.

I. INTRODUCTION

AS THE SCALE of a decentralized distributed system increases, the presence of malicious behavior (e.g., Byzantine failures) becomes the norm rather than the exception. Most designs against such malicious behavior rely on the assumption that a certain fraction of the nodes in the system are honest. For example, virtually all protocols for tolerating Byzantine failures assume that at least 2/3 of the nodes are honest.
This makes these protocols vulnerable to sybil attacks [1], in which a malicious user takes on multiple identities and pretends to be multiple, distinct nodes (called sybil nodes or sybil identities) in the system. With sybil nodes comprising a large fraction (e.g., more than 1/3) of the nodes in the system, the malicious user is able to “out vote” the honest users, effectively breaking previous defenses against malicious behaviors. Thus, an effective defense against sybil attacks would remove a primary practical obstacle to collaborative tasks on peer-to-peer (p2p) and other decentralized systems. Such tasks include not only Byzantine failure defenses, but also voting schemes in file sharing, DHT routing, and identifying worm signatures or spam.

Manuscript received January 31, 2007; revised October 31, 2007; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor D. Yau. This work was supported in part by NUS Grant R-252-050-284-101 and Grant R-252-050-284-133. A preliminary version of this paper appeared in the Proceedings of the ACM SIGCOMM 2006 Conference, Pisa, Italy.
H. Yu is with the Computer Science Department, National University of Singapore, Singapore 117543 (e-mail: haifeng@comp.nus.edu.sg).
M. Kaminsky and P. B. Gibbons are with Intel Research Pittsburgh, Pittsburgh, PA 15213 USA (e-mail: michael.e.kaminsky@intel.com; phillip.b.gibbons@intel.com).
A. D. Flaxman was with Carnegie Mellon University, Pittsburgh, PA 15213 USA. He is now with Microsoft Research, Redmond, WA 98052 USA (e-mail: abie@microsoft.com).
Digital Object Identifier 10.1109/TNET.2008.923723

Problems With Using a Central Authority. A trusted central authority that issues and verifies credentials unique to an actual human being can control sybil attacks easily. For example, if the system requires users to register with government-issued social security numbers or driver’s license numbers, then the barrier for launching a sybil attack becomes much higher.
The central authority may instead require a payment for each identity. Unfortunately, there are many scenarios where such designs are not desirable. For example, it may be difficult to select or establish a single entity that every user worldwide is willing to trust. Furthermore, the central authority can easily be a single point of failure, a single target for denial-of-service attacks, and also a bottleneck for performance, unless its functionality is itself widely distributed. Finally, requiring sensitive information or payment in order to use a system may scare away many potential users.

Challenges in Decentralized Approaches. Defending against sybil attacks without a trusted central authority is much harder. Many decentralized systems today try to combat sybil attacks by binding an identity to an IP address. However, malicious users can readily harvest (steal) IP addresses. Note that these IP addresses may have little similarity to each other, thereby thwarting attempts to filter based on simple characterizations such as common IP prefix. Spammers, for example, are known to harvest a wide variety of IP addresses to hide the source of their messages, by advertising BGP routes for unused blocks of IP addresses [2]. Beyond just IP harvesting, a malicious user can co-opt a large number of end-user machines, creating a botnet of thousands of compromised machines spread throughout the Internet. Botnets are particularly hard to defend against because nodes in botnets are indeed distributed end users’ computers.

The first investigation into sybil attacks [1] proved a series of negative results, showing that they cannot be prevented unless special assumptions are made. The difficulty stems from the fact that resource-challenge approaches, such as computation puzzles, require the challenges to be posed and validated simultaneously. Moreover, the adversary can potentially have significantly more resources than a typical user.
Even puzzles that require human effort, such as CAPTCHAs [3], can be reposted on the adversary’s web site to be solved by other users seeking access to the site. Furthermore, these challenges must be performed directly instead of trusting someone else’s challenge results, because sybil nodes can vouch for each other. A more recent proposal [4] suggests the use of network coordinates [5] to determine whether multiple identities belong to the same user (i.e., have similar network coordinates). Despite its elegance, a malicious user controlling just a moderate number of network