Elders Know Best - Handling Churn in Less Structured P2P Systems

Yi Qiao and Fabián E. Bustamante
Department of Computer Science
Northwestern University, Evanston, IL 60201, USA
Email: {yqiao,fabianb}@cs.northwestern.edu

Abstract

We address the problem of highly transient populations in unstructured and loosely-structured peer-to-peer systems. We propose a number of illustrative query-related strategies and organizational protocols that, by taking into consideration the expected session times of peers (their lifespans), yield systems with performance characteristics more resilient to the natural instability of their environments. We first demonstrate the benefits of lifespan-based organizational protocols in terms of end-application performance and in the context of dynamic and heterogeneous Internet environments. We do this using a number of currently adopted and proposed query-related strategies, including methods for query distribution, caching and replication. We then show, through trace-driven simulation and wide-area experimentation, the performance advantages of lifespan-based, query-related strategies when layered over currently employed and lifespan-based organizational protocols. While merely illustrative, the evaluated strategies and protocols clearly demonstrate the advantages of considering peers' session time in designing widely deployed peer-to-peer systems.

1 Introduction

Due in part to the autonomous nature of peers, their architectural mutual dependency, and their excessively large populations, the transiency of peer populations (a.k.a. churn) and its implications for P2P systems have recently attracted the attention of the research community [3, 19, 7, 27, 16]. Measurement studies of deployed P2P systems have reported median session times¹ varying from one hour to one minute [29, 6, 27]. The implications of such a high degree of transiency on the overall system's performance would clearly depend on the level of nodes' investment in their neighboring peers. At the very least, the number of maintenance-related messages processed by any node is proportional to the degree of stability of the node's neighboring set. Further, in the context of data-sharing P2P systems, the level of replication, the effectiveness of caches, and the spread and satisfaction rate of queries will all be affected by how dynamic the peers' population is.

¹ Where a node's session time is the time from the node's joining to its subsequent leaving from the system. We employ lifespan and session time interchangeably. Another metric of transiency sometimes used, lifetime, refers instead to the time between the node first entering the system and its final departure from it [27].

We address the problem of highly transient populations in unstructured and loosely-structured peer-to-peer (P2P) systems (collectively, less structured P2P systems). Through active probing of over half a million peers in a widely deployed P2P system, we determined that the session time of peers can be well modeled by a Pareto distribution. In our context, this means that the expected remaining session time of a peer is directly proportional to the session's current length, i.e., the peer's current age. This observation forms the basis for a set of new protocols for peer organization and query-related strategies that, by taking into consideration the expected session times of peers (their lifespans), yield systems with performance characteristics more resilient to the natural instability of their environments.

We first demonstrate the benefits of considering lifespan in organizational protocols (i.e., how peers organize themselves in an overlay network) in terms of end-application performance and in the context of dynamic and heterogeneous Internet environments.
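
The Pareto observation above, that a peer's expected remaining session time grows in proportion to its current age, can be illustrated with a small sketch. For a Pareto distribution with shape alpha > 1 and minimum value x_min, the mean residual life at age t >= x_min is t / (alpha - 1). The shape value, the function names, and the sample sizes below are assumptions chosen for illustration only; they are not parameters reported in this paper.

```python
import random


def expected_remaining(age, alpha, x_min=1.0):
    """Closed-form mean remaining session time under a Pareto(alpha, x_min) model.

    alpha and x_min are illustrative assumptions, not fitted values.
    """
    if alpha <= 1:
        raise ValueError("the mean remaining time is infinite for alpha <= 1")
    if age < x_min:
        # No session ends before x_min, so subtract the age from the plain mean.
        return alpha * x_min / (alpha - 1) - age
    # For age >= x_min the remaining time is linear in the current age.
    return age / (alpha - 1)


def empirical_remaining(age, alpha, x_min=1.0, samples=200_000, seed=0):
    """Monte Carlo estimate of the same quantity from simulated sessions."""
    rng = random.Random(seed)
    residuals = []
    for _ in range(samples):
        # random.paretovariate draws from a Pareto with minimum value 1.
        session = x_min * rng.paretovariate(alpha)
        if session > age:
            residuals.append(session - age)
    if not residuals:
        raise ValueError("no simulated session outlived the requested age")
    return sum(residuals) / len(residuals)


if __name__ == "__main__":
    alpha = 3.0  # hypothetical shape parameter
    for age in (1.0, 2.0, 4.0):
        print(age, expected_remaining(age, alpha),
              round(empirical_remaining(age, alpha), 3))
```

Under this model, doubling a peer's age doubles its expected remaining session time, which is the intuition behind preferring older peers as neighbors.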
The lifespan-based approach for organizational protocols was first proposed in our position paper [6], where we show its effectiveness in terms of increased system stability (e.g., a reduction of over 42% in the ratio of connection breakdowns and their associated costs) through a simulation study. In this paper we go beyond those preliminary results to evaluate the advantages of the proposed approach in terms of application performance in a dynamic Internet testbed of 150 PlanetLab nodes distributed worldwide [25]. We do this using a set of illustrative organizational protocols combined with a number of currently adopted and proposed