EVOLUTION OF SOCIAL P2P NETWORKS BASED ON THE DYNAMICS OF HETEROGENEOUS MULTIMEDIA PEERS

Hyunggon Park and Mihaela van der Schaar

Multimedia Communications and Systems Lab.
Electrical Engineering Department, University of California, Los Angeles (UCLA)

ABSTRACT

In this paper, we consider social peer-to-peer (P2P) networks, where peers share their resources (i.e., multimedia content and upload bandwidth). In the considered P2P networks, peers are self-interested and therefore determine their resource divisions (i.e., actions) among their associated peers such that their utility (e.g., multimedia quality) is maximized. Peers determine their optimal strategies for selecting their actions based on a Markov Decision Process (MDP) framework, which enables the peers to maximize their cumulative utilities. We consider heterogeneous peers that have different and limited abilities to characterize their resource reciprocations, using only a limited number of states. We investigate how the limited number of states impacts the resource reciprocation and the resulting multimedia quality over time. Simulation results show that peers that simultaneously refine their state descriptions can improve the multimedia quality in the resource reciprocation. Moreover, peers prefer to interact with other peers that have higher available upload bandwidths and similar capabilities for refining their number of states.

Index Terms— Social peer-to-peer (P2P) network, evolution of resource reciprocation, Markov decision process.

1. INTRODUCTION

Social network communities such as [1–3] have recently become popular, and among them, peer-to-peer (P2P) applications represent a large majority of the traffic currently transmitted over the Internet. The traffic exchanged is often multimedia content, e.g., downloads of multimedia data.

Recently, several solutions have been proposed for general file sharing [2,4,5] and multimedia streaming [4,6] in P2P networks.
Among these solutions, we consider data-driven approaches [4–6], where the multimedia content or general files of each peer are divided into chunks of uniform length and are then disseminated over the P2P network. Based on the chunk availability, peers form groups with which they can continuously exchange their chunks. While this approach has been successfully deployed in P2P applications, key challenges such as determining optimal resource reciprocation strategies among self-interested peers still remain largely unaddressed.

A resource reciprocation strategy among self-interested peers in BitTorrent systems has been developed based on a tit-for-tat (TFT) strategy, where a peer selects some of its associated peers (i.e., leechers) that are currently uploading at the highest rates to download its content [5]. A key disadvantage of this method is that a peer deploying this strategy decides its resource reciprocation by evaluating only the current upload rates that it receives from its associated peers. Thus, the resource reciprocation is determined myopically. However, since peers in P2P networks are generally involved in repeated, long-term interactions, such myopic decisions can result in suboptimal performance for the involved peers.

To take into account the repeated resource reciprocation among self-interested peers, each peer determines its actions by considering the probabilistic behavior of resource reciprocation of its associated peers. Formalizing the resource reciprocation based on an MDP [7] enables the peers to take foresighted actions in a way that maximizes their expected cumulative rewards (e.g., download rates or multimedia quality). While our previous work [8] shows that the MDP-based foresighted strategies improve the performance of P2P applications, it does not investigate how heterogeneous peers interact with each other based on their different abilities.
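
The contrast between a myopic decision and an MDP-based foresighted decision can be illustrated with a minimal sketch. It is not the model used in this paper: the two states, the two actions, the transition probabilities, the discount factor, and in particular the `KEPT_BANDWIDTH_BONUS` term (an immediate benefit of uploading less) are hypothetical values chosen only so that the two policies differ.

```python
# Toy MDP, all numbers hypothetical (not taken from the paper).
# State s: 0 = associated peer uploads at a low rate, 1 = at a high rate.
# Action a: 0 = we upload little to that peer, 1 = we upload a lot.
GAMMA = 0.9                       # assumed discount factor
RATES = [1.0, 3.0]                # assumed download rate received in each state
KEPT_BANDWIDTH_BONUS = 0.5        # assumed immediate gain from uploading little

# REWARD[s][a]: immediate reward of taking action a in state s.
REWARD = [[rate + KEPT_BANDWIDTH_BONUS, rate] for rate in RATES]

# TRANSITION[a][s][s_next]: assumed probability that the peer moves to s_next.
TRANSITION = [
    [[0.9, 0.1], [0.8, 0.2]],     # uploading little: peer tends to reduce its rate
    [[0.3, 0.7], [0.1, 0.9]],     # uploading a lot: peer tends to reciprocate
]


def myopic_policy(reward):
    """Pick, in every state, the action with the largest immediate reward."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in reward]


def q_value(s, a, values, reward, transition, gamma):
    """Immediate reward plus discounted expected value of the next state."""
    expected_future = sum(p * v for p, v in zip(transition[a][s], values))
    return reward[s][a] + gamma * expected_future


def foresighted_policy(reward, transition, gamma=GAMMA, tol=1e-9):
    """Value iteration: maximize the expected discounted cumulative reward."""
    n_states, n_actions = len(reward), len(reward[0])
    values = [0.0] * n_states
    while True:
        new_values = [
            max(q_value(s, a, values, reward, transition, gamma)
                for a in range(n_actions))
            for s in range(n_states)
        ]
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    policy = [
        max(range(n_actions),
            key=lambda a: q_value(s, a, values, reward, transition, gamma))
        for s in range(n_states)
    ]
    return policy, values


if __name__ == "__main__":
    print("myopic policy:     ", myopic_policy(REWARD))
    print("foresighted policy:", foresighted_policy(REWARD, TRANSITION)[0])
```

Under these assumed numbers, the myopic peer always uploads little because that maximizes its immediate reward, whereas the foresighted peer uploads a lot in both states because doing so makes high future download rates more likely.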

We consider heterogeneous peers that have different and limited abilities to characterize their resource reciprocation. The resource reciprocation of each peer is described based on a finite number of state descriptions. Hence, the heterogeneous peers cannot differentiate among all possible download rates from their associated peers. Consequently, a peer may have multiple actions that are optimal because these actions do not alter its associated peers' states, and thus, they do not alter the resource reciprocation of these peers. We analytically show that multiple optimal actions exist for such heterogeneous peers. Moreover, we show that peers can mutually improve their download rates only if they simultaneously refine their state descriptions.

This paper is organized as follows. In Section 2, the MDP-based resource reciprocation strategy for P2P networks is presented. In Section 3, we study the evolution of resource reciprocation for heterogeneous peers. Simulation results are presented in Section 4 and conclusions are drawn in Section 5.
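
As an informal illustration of the finite state descriptions discussed in this introduction, the following sketch shows why several actions can be indistinguishable to an associated peer. The quantization boundaries, the rates (in arbitrary units), and the function names are hypothetical and are not taken from the paper; the sketch only assumes that a peer maps each received rate to one of a finite number of intervals.

```python
from bisect import bisect_right


def state_of(rate, boundaries):
    """Map a received rate to a state index.

    `boundaries` is a sorted list of thresholds; a peer with k boundaries
    distinguishes k + 1 states (hypothetical quantization rule).
    """
    return bisect_right(boundaries, rate)


def group_actions_by_state(upload_rates, boundaries):
    """Group candidate upload rates by the state they induce at the receiver."""
    groups = {}
    for rate in upload_rates:
        groups.setdefault(state_of(rate, boundaries), []).append(rate)
    return groups


# Candidate actions of a peer: upload rates it could allocate (assumed values).
ACTIONS = [60.0, 70.0, 90.0]

COARSE = [50.0]              # receiver with 2 states
FINE = [25.0, 50.0, 75.0]    # receiver with 4 states

if __name__ == "__main__":
    print("coarse receiver:", group_actions_by_state(ACTIONS, COARSE))
    print("fine receiver:  ", group_actions_by_state(ACTIONS, FINE))
```

With the coarse description, all three upload rates fall into the same state, so the receiving peer reacts identically to each of them; with the finer description, the highest rate is distinguished from the other two.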