Jelly: A Dynamic Hierarchical P2P Overlay Network
with Load Balance and Locality
Richard Hsiao and Sheng-De Wang
Department of Electrical Engineering
National Taiwan University, Taipei 106, TAIWAN
Abstract
P2P systems based on distributed hash tables (DHTs),
such as CAN, Chord, Pastry, and Tapestry, use uniform
hash functions to balance load across participant
nodes. But this even distribution in the virtual
space destroys the locality between participant nodes.
Topology-based hierarchical overlay networks such as
Grapes exploit physical distance information among the
nodes to construct a two-layered hierarchy. This greatly
improves locality but damages the load balance property
of the original DHTs. In this paper, we propose a dynamic
P2P overlay infrastructure, called Jelly, that achieves
both load balancing and locality. Its design is based on
a hierarchical overlay and uses a DHT as its routing
algorithm. Because load balance in a hierarchical
overlay depends on whether the virtual hierarchy is
balanced, Jelly uses a node joining mechanism as a
fine-tuning tool and a dynamic checking mechanism as a
coarse-tuning tool to balance the hierarchy. We also
find that the average number of routing hops is a
practical metric for evaluating the network size, and
that it is useful for Jelly’s dynamic mechanism.
1. Introduction
In recent years, peer-to-peer (P2P) systems have
become a burgeoning research topic in large
distributed systems. Gnutella [1] and Napster [2] are
the most famous peer-to-peer file sharing systems
among these, but both have scalability
problems. To address this, distributed hash
tables (DHTs) have become a fundamental building block
for peer-to-peer overlay networks; CAN [3],
Chord [4], Pastry [5], and Tapestry [6] are well-known
examples of these infrastructures. Many applications are
layered above DHTs, such as file sharing systems [7]
[8] [9], event notification services [10] [11], and
application-layer multicast [12] [13] [14]. Although
each DHT has its own location and routing
algorithms, all of them share one feature: they use
consistent hashing (based on a hash function such as
SHA-1) to distribute the participant nodes and objects
uniformly in the virtual space. Under normal conditions,
these systems achieve fairly good load balance.
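
The hashing step described above can be sketched as follows.
This is a minimal illustration, not the scheme of any one of
the cited systems: it assumes a Chord-style ring in which an
object is owned by the first node ID at or after the object's
ID, and the names sha1_id, owner, and objects_per_node are
hypothetical helpers introduced here.

```python
import bisect
import hashlib


def sha1_id(key: str) -> int:
    """Map a node address or object name to a 160-bit point in the virtual space."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def owner(object_key: str, ring: list[int]) -> int:
    """Return the node ID responsible for object_key.

    Assumption: `ring` is a sorted list of node IDs and ownership follows
    a Chord-style successor rule (first node ID >= object ID, wrapping around).
    """
    i = bisect.bisect_left(ring, sha1_id(object_key))
    return ring[i % len(ring)]


def objects_per_node(node_names, object_names):
    """Count how many objects each node ends up storing."""
    by_id = {sha1_id(n): n for n in node_names}
    ring = sorted(by_id)
    counts = {n: 0 for n in node_names}
    for obj in object_names:
        counts[by_id[owner(obj, ring)]] += 1
    return counts
```

Calling objects_per_node with a list of node names and a list
of object names returns the per-node object counts, which can
be inspected to see how evenly the hash spreads the objects.
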
However, the primitive DHT schemes have a
significant disadvantage: they may violate the
locality property. During location and routing,
each message chooses its next hop
regardless of the physical topology.
This leads to poor response time and a long
overall physical path length for the lookup service.
To address this problem, DHTs should take
into account the relative physical positions
of the participant nodes. All of these systems
have adopted similar approaches, such as [18], that
exploit locality by measuring a proximity metric such as
round trip time (RTT) or the number of IP-level hops. This
improvement ensures that the next hop selected is a
relatively close node on the underlying network among
those that match the routing condition, but the physical
distance between the node looking for an object and
the node storing that object can still be long.
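
Proximity-aware next-hop selection of this kind can be
sketched as below. The function name pick_next_hop and the
rtt_ms table are assumptions made for illustration; the
sketch only shows the idea of choosing, among the candidates
that already satisfy the routing condition, the one with the
smallest measured RTT.

```python
def pick_next_hop(candidates, rtt_ms):
    """Choose the physically closest node among valid next hops.

    candidates: node identifiers that all satisfy the DHT routing condition.
    rtt_ms: dict mapping node identifier -> measured round trip time (ms).
    Nodes without a measurement are treated as infinitely far (an assumption).
    """
    if not candidates:
        raise ValueError("no candidate satisfies the routing condition")
    return min(candidates, key=lambda node: rtt_ms.get(node, float("inf")))
```

Note that this only optimizes each individual hop; it does
not shorten the distance between the querying node and the
node that finally holds the object.
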
Grapes [15] provides a hierarchical virtual network
infrastructure using physical topology information. It
has a two-layered overlay network: the upper layer is
called the super-network and the lower layer the
sub-network. In both layers, any DHT routing
algorithm can be used. Each sub-network has a leader
that joins super-network routing and manages the
sub-network. Physically nearby nodes form
a sub-network, and during each super-network
query, the leader caches the object in its sub-network.
As a result, a node can find an object in its own
sub-network with high probability. Because the physical
distance between any pair of nodes in a sub-network is
short, this infrastructure can greatly reduce the lookup
distance.
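
The two-layer lookup with leader caching can be pictured with
the following simplified sketch. It is not Grapes' actual
implementation: the class and function names are hypothetical,
the super-network DHT is replaced by a plain dictionary, and
the sub-network cache is modelled as a dictionary held by the
leader rather than as objects stored in a sub-network DHT.

```python
class SuperNetwork:
    """Stand-in for the upper-layer DHT (assumption: a plain key/value map)."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.queries = 0  # number of lookups that left the sub-network

    def lookup(self, key):
        self.queries += 1
        return self.objects.get(key)


class SubNetwork:
    """A group of physically nearby nodes managed by one leader."""

    def __init__(self, leader):
        self.leader = leader
        self.cache = {}  # objects cached after earlier super-network queries


def lookup(subnet, supernet, key):
    """Try the local sub-network first, then fall back to the super-network."""
    if key in subnet.cache:
        return subnet.cache[key], "sub-network"
    value = supernet.lookup(key)
    if value is not None:
        subnet.cache[key] = value  # the leader caches the result locally
    return value, "super-network"
```

A first lookup for a key goes through the super-network; a
repeated lookup from the same sub-network is then answered
locally, which is where the reduction in lookup distance
comes from.
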
Although a hierarchical overlay network like
Grapes can greatly improve the locality property of
DHTs, it does not preserve load balance. If
the DHT provides load balance, then each leader in
the super-network is assigned nearly the same load.
After the lower-layer mapping, however, the load of each
node in the entire system is no longer balanced: the
larger a sub-network is, the lighter the
load assigned to each of its subnodes. Grapes does not
provide any mechanism to adjust the size of
sub-networks. As a result, its node joining algorithm
produces some extremely large sub-networks and a
significant number of sub-networks with relatively
few subnodes.
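
A small worked example makes the imbalance concrete. The
sketch below assumes, for illustration only, that the total
load is split equally among the leaders and that each
leader's share is then spread evenly over the subnodes of its
sub-network; the function name and the numbers are
hypothetical.

```python
def per_node_load(total_load, subnet_sizes):
    """Load carried by one subnode in each sub-network.

    Assumptions: every leader receives total_load / (number of sub-networks),
    and that share is divided evenly among the sub-network's subnodes.
    """
    share = total_load / len(subnet_sizes)
    return [share / size for size in subnet_sizes]


# Three sub-networks with 100, 10, and 2 subnodes sharing a load of 1200.
print(per_node_load(1200, [100, 10, 2]))  # -> [4.0, 40.0, 200.0]
```

Under these assumptions, a subnode in the smallest
sub-network carries fifty times the load of a subnode in the
largest one, even though the leaders are perfectly balanced.
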
To address both the load balancing and locality
problems, we propose Jelly, a dynamic hierarchical
overlay network. Our main goal is to construct and
maintain a well-balanced two-layered overlay
network (one in which the size of every sub-network falls
within a given range), so that each participant node is
assigned a similar load. Jelly’s node joining
mechanism is similar to that of Grapes. The difference is
that a newly-joined node checks not only whether the
physical distance between itself and each leader on the
path of the inserting process is shorter than a
threshold, but also the size of the
sub-network that each leader manages. If the size is
larger than a given threshold, it is not appropriate to
add one more node to this sub-network, because doing so
may worsen the imbalance of the entire hierarchy.
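
The two checks made during joining can be sketched as below.
This is an illustration under stated assumptions, not Jelly's
actual algorithm: the Leader record, the function name, and
the choice to accept the first qualifying leader on the path
are hypothetical, and returning None when no leader qualifies
is only a placeholder for behaviour not specified at this
point in the text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Leader:
    name: str
    distance: float   # measured physical distance (e.g. RTT) to the joining node
    subnet_size: int  # number of subnodes the leader currently manages


def choose_subnetwork(
    path_leaders: Iterable[Leader],
    distance_threshold: float,
    size_threshold: int,
) -> Optional[Leader]:
    """Return a leader whose sub-network the new node may join, if any.

    A leader qualifies only if it is closer than distance_threshold AND
    its sub-network is not larger than size_threshold.
    """
    for leader in path_leaders:
        close_enough = leader.distance < distance_threshold
        has_room = leader.subnet_size <= size_threshold
        if close_enough and has_room:
            return leader
    return None  # placeholder: handling of this case is an assumption
```

For example, with a distance threshold of 30 and a size
threshold of 8, a nearby leader that already manages 50
subnodes is skipped in favour of a nearby leader that manages
5.
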
Therefore, only when the newly-joined node finds a
Proceedings of the 24th International Conference on Distributed Computing Systems Workshops (ICDCSW’04)
0-7695-2087-1/04 $20.00 © 2004 IEEE