A Peer’s-Eye View: Network Term Clouds in a Peer-to-Peer System

Raynor Vliegendhart, R.Vliegendhart@tudelft.nl
Martha Larson, M.A.Larson@tudelft.nl
Christoph Kofler, C.Kofler@tudelft.nl
Johan Pouwelse, J.A.Pouwelse@tudelft.nl
Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands

ABSTRACT
We investigate term clouds that represent the content available in a peer-to-peer (P2P) network. Such network term clouds are non-trivial to generate in distributed settings. Our term cloud generator was implemented and released in Tribler, a widely used, server-free P2P system, to support users in understanding the sorts of content available. Our evaluation and analysis focus on three aspects of the clouds: coverage, usefulness and accumulation speed. A live experiment demonstrates that individual peers accumulate substantial network-level information, indicating good coverage of the overall content of the system. The results of a user study carried out on a crowdsourcing platform confirm the usefulness of clouds, showing that they succeed in conveying to users information on the type of content available in the network. An analysis of five example peers reveals that accumulation speeds of terms at new peers can support the development of a semantically diverse term set quickly after a cold start. This work represents the first investigation of term clouds in a live, 100% server-free P2P setting.

Categories and Subject Descriptors: H.3 [Information Storage and Retrieval]: H.3.3 Information Search and Retrieval - Search process; H.3.4 Systems and Software - Distributed systems
General Terms: Performance, Experimentation
Keywords: P2P, term cloud, gossip protocol, user study

1. INTRODUCTION
New users encountering a search system can search more effectively if they have appropriate expectations of the sort of content that can be found in the system.
Tribler is a real-world peer-to-peer (P2P) file-sharing system (downloadable from http://www.tribler.org) that offers a search functionality [7]. We developed and implemented a term cloud generator in order to promote successful searches by providing users with an impression of the types of content available in the system. Informal observation of user interaction patterns suggests that users having more experience with the Tribler system formulate a greater number of successful queries. The term clouds are intended to provide a quicker substitute for system interaction experience. If the clouds support users in understanding which information needs Tribler can fulfill, it can be expected that their queries better match the content of the system, leading, in the long term, to higher satisfaction and better user retention rates.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CIKM’11, October 24–28, 2011, Glasgow, Scotland, UK. Copyright 2011 ACM 978-1-4503-0717-8/11/10 ...$10.00.

The term clouds are generated using the frequency counts of terms extracted from the names of files within the network that are accumulated at an individual peer by way of the underlying process used to exchange information among peers. This paper investigates the question of whether effective term clouds reflecting overall network content can be created in a distributed environment. We focus on three aspects: coverage, usefulness and accumulation speed. Note that this focus excludes investigation of cloud animation.
Here, we simply state that animation switches cloud views at regular intervals to give the user the impression of the scope and dynamic development of the content of the system. Analysis of use pattern statistics and long-term impact on the uptake of Tribler are also left for future work.

In a completely distributed environment such as Tribler, building a network term cloud is non-trivial. Within the network, content is stored not on a central server with a ‘bird’s-eye’ view, but rather at the individual peers. An individual peer can receive information about content at other peers only by communicating with its direct neighbors. In other words, in an environment that is 100% server free, the only view of the content collection that is available is a ‘peer’s-eye’ view. In order for term clouds to be useful, the communication between peers must provide fast and high-coverage information about the content in the network. The key contribution of this paper is to demonstrate that a server-free architecture does not prevent peers from generating clouds that provide a global overview and are helpful for users.

After presenting background and related work, we report results of a live discovery experiment investigating cloud coverage, i.e., how well ‘peer’s-eye’ clouds reflect network-level content. Then, we investigate the usefulness of the clouds, i.e., their ability to convey an impression of the content of the network to users, with a user study. Finally, we examine the accumulation speed of the clouds with a qualitative analysis of the cold start phase of example peers that reflects the experience of new users entering the network. We finish with our conclusion and outlook.
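
To make the ‘peer’s-eye’ accumulation described above concrete, the following is a minimal illustrative sketch in Python, not the Tribler implementation. It assumes that a peer receives batches of file names from its neighbors, splits each name into lowercase alphanumeric terms, and keeps a running frequency count from which the most frequent terms form the cloud. The names `Peer`, `extract_terms`, `receive`, `term_cloud` and the `STOPWORDS` list are hypothetical and chosen only for illustration; the tokenization rule and the handling of duplicate file names are likewise assumptions.

```python
import re
from collections import Counter

# Hypothetical stop list: common words and file extensions (an assumption,
# not taken from the paper).
STOPWORDS = {"the", "a", "of", "and", "avi", "mkv", "mp3", "iso", "pdf"}


def extract_terms(filename):
    """Split a file name into lowercase alphanumeric terms, dropping stop words."""
    tokens = re.findall(r"[a-z0-9]+", filename.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) > 1]


class Peer:
    """A peer that only sees file names passed on by its neighbors."""

    def __init__(self):
        self.known_files = set()      # file names already counted
        self.term_counts = Counter()  # term -> frequency over known files

    def receive(self, filenames):
        """Accumulate terms from a batch of file names sent by a neighbor."""
        for name in filenames:
            if name in self.known_files:
                continue  # assumption: each file name is counted only once
            self.known_files.add(name)
            self.term_counts.update(extract_terms(name))

    def term_cloud(self, size):
        """Return the `size` most frequent (term, count) pairs seen so far."""
        return self.term_counts.most_common(size)
```

In this sketch, a newly started peer begins with empty counts (the cold start), and its cloud grows with every call to `receive`; how quickly the counts come to resemble network-wide term frequencies depends on the underlying exchange process between peers, which is not modeled here.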