FUNNEL: Choking Polluters in BitTorrent File Sharing Communities

Flávio Roberto Santos, Member, IEEE, Weverton Luis da Costa Cordeiro, Member, IEEE, Luciano Paschoal Gaspary, Member, IEEE, and Marinho Pilla Barcellos, Member, IEEE

Abstract—BitTorrent-based file sharing communities are very popular nowadays. Anecdotal evidence suggests that such communities are exposed to content pollution attacks (i.e., publication of 'false' files, viruses, or other malware), requiring a moderation effort from their administrators. The size of this cumbersome task increases with the content publishing rate. To tackle this problem, we propose a generic pollution control strategy and instantiate it as a mechanism for BitTorrent communities. The strategy follows a conservative approach: it regards newly published content as polluted, and allows the dissemination rate to increase according to the proportion of positive feedback issued about the content. In contrast to related approaches, the strategy and mechanism avoid the problem of pollution dissemination at the initial stages of a swarm, when insufficient feedback is available to form a reputation about the content. To evaluate the proposed solution, we conducted a set of experiments using a popular BitTorrent agent and an implementation of our mechanism. Results indicate that the proposed approach mitigates the dissemination of polluted content in BitTorrent while imposing a low overhead on the distribution of non-polluted content.

Index Terms—peer-to-peer; pollution; BitTorrent; experimental evaluation.

I. INTRODUCTION

Peer-to-Peer (P2P) file sharing systems have been widely adopted for content dissemination among Internet users. Popular instantiations include Gnutella, KaZaA, eDonkey, and BitTorrent. Despite their popularity, P2P file sharing applications have been suffering from different kinds of denial-of-service (DoS) attacks, such as query flooding and pollution.
For example, in the content pollution attack, a malicious user publishes a large number of decoys (with the same or similar meta-data), so that queries for a given content return predominantly fake or corrupted copies (e.g., a blank media file or an executable infected with a virus) [1]. Another attack is meta-data pollution, which consists of publishing a file with misleading meta-data, inducing users to download files that do not correspond to the desired content. A third kind of attack, known as index poisoning [2], consists of creating many bogus records that associate titles with identifiers of non-existing copies, with false IP addresses or port numbers. Studies (e.g., [3], [4]) looked at the impact of these attacks on Gnutella and KaZaA networks.

Manuscript received January 5, 2011; revised May 20, 2011 and July 19, 2011. The associate editor coordinating the review of this paper and approving it for publication was David Hutchison. The authors are with the Institute of Informatics – Federal University of Rio Grande do Sul (UFRGS), Porto Alegre, RS, Brazil (email: {flavio.santos, weverton.cordeiro, paschoal, marinho}@inf.ufrgs.br).

In the context of BitTorrent, DoS attacks exploiting vulnerabilities in the protocol were first addressed in [5], and countermeasures to those attacks, in [6]. Later, an experimental study [7] confirmed that DoS attacks were in fact being launched against BitTorrent networks. More recent evidence indicates that content pollution is currently a very common DoS attack vector against BitTorrent [8], [9]. The attack consists of posting torrent files with fake content to user communities, which share various kinds of content (e.g., digital documents and software updates).

There has been substantial research on means to mitigate content pollution in P2P systems, leading to the proposal of several generic countermeasure mechanisms [10], [11], [12].
Although these are interesting strategies to fight pollution, only Credence [12] is supported by an implementation and an experimental evaluation. The applicability of the other approaches remains uncertain.

In the BitTorrent realm, to the best of our knowledge, there exists only a set of ad hoc solutions designed specifically to mitigate pollution in user communities. These are simplistic solutions, such as discussion forums where users can post testimonials about content, reporting mechanisms to notify the community administrators, and voting mechanisms to automatically isolate suspicious content. These approaches, however, require a non-negligible moderation effort through manual inspection of the content. Besides, there are no mechanisms to provide incentives for users to cooperate (i.e., by reporting pollution, posting comments, or voting) against pollution.

To tackle the aforementioned problem, we proposed in a previous work [13] a conservative strategy, and a corresponding mechanism, to control the dissemination of content pollution in BitTorrent communities. The strategy counts positive and negative votes assigned to each content item in order to classify it as either non-polluted or polluted. The mechanism, called FUNNEL, operates by controlling the distribution of copies according to votes assigned by community users. That is, the number of concurrent downloads permitted is affected by the proportion between positive and negative votes.

In the present paper, we extend our previous work and elaborate on technical aspects that had not been addressed earlier. More specifically, we (i) present a new measurement study on the dissemination of polluted content in BitTorrent file sharing communities, (ii) emphasize deployment aspects of FUNNEL, and (iii) provide a detailed analysis to show the effectiveness and efficiency of FUNNEL in tackling the dissemination of polluted content.
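To make the conservative idea concrete, the following minimal Python sketch maps vote counts to a number of permitted concurrent downloads. It is only an illustration under stated assumptions, not the FUNNEL formula, which this section does not give: the function name, the slot bounds (`min_slots`, `max_slots`), and the `prior_negative` pseudo-votes that make unrated content start at the minimum are all hypothetical.

```python
def allowed_concurrent_downloads(positive, negative,
                                 min_slots=1, max_slots=50,
                                 prior_negative=5):
    """Illustrative vote-to-slots rule (hypothetical, not the paper's formula).

    New content is treated as polluted: with no votes, only `min_slots`
    concurrent downloads are allowed. The allowance grows with the
    proportion of positive votes and approaches `max_slots`.
    """
    if positive < 0 or negative < 0:
        raise ValueError("vote counts must be non-negative")
    if not 1 <= min_slots <= max_slots:
        raise ValueError("require 1 <= min_slots <= max_slots")
    if prior_negative < 1:
        raise ValueError("prior_negative must be at least 1")

    # Assumed pessimistic prior: pretend `prior_negative` negative votes
    # already exist, so the positive share starts at zero for new content.
    total = positive + negative + prior_negative

    # Integer arithmetic: extra slots proportional to the positive share.
    extra = positive * (max_slots - min_slots) // total
    return min_slots + extra


if __name__ == "__main__":
    for votes in [(0, 0), (10, 0), (95, 0), (0, 10)]:
        print(votes, allowed_concurrent_downloads(*votes))
```

With these assumed defaults, content with no votes is limited to a single concurrent download, while content with 95 positive and no negative votes is allowed 47. Negative votes alone never raise the allowance above the minimum.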
From the distributed systems operations and management point of view, this paper follows the track of previous investi-