A Demand-based Algorithm for Rapid Updating of Replicas
Jesús Acosta-Elias, Leandro Navarro-Moldes
Polytechnic University of Catalonia, Spain
{jacosta, leandro}@ac.upc.es
Abstract
In many Internet-scale replicated systems, not all
replicas can be treated in the same way, since some are
in greater demand than others. In the case of weak
consistency algorithms, we have observed that by updating
the replicas in greatest demand first, a greater number of
clients gain access to updated content in a shorter
period of time.
In this work we have investigated the benefits that can
be obtained by prioritizing replicas with greater demand,
and considerable improvements have been achieved. In
zones of higher demand, the consistent state is reached up
to six times faster than with a standard weak consistency
algorithm, without incurring the additional costs of
strong consistency.
Keywords: Weak consistency, consistency algorithms,
replication, distributed systems.
1. Introduction
There is growing interest in Internet-scale distributed
systems in which many potential clients may contact a single
host to request a given service at almost the same time
from several locations. The presence of replica servers
helps to improve the situation because clients are able
to contact the nearest replica. A replica is a host that
provides exactly the same services as the principal host.
In this paper we use the terms server and replica
interchangeably.
Content replication between servers in a distributed
system is justified by the need to reduce delay, to provide
availability, to be scalable [11], to tolerate link
failures, and to withstand network partitions. The
algorithms currently available for replica updating can be
broadly classified into two groups, according to their
consistency model:
- Strong consistency, and
- Weak consistency
Strong consistency algorithms are costly, do not scale
well, are not very reliable, and generate considerable
latency and a great deal of traffic. They are suitable for
systems with a small number of replicas, in which it must
be guaranteed that all the replicas are in a consistent state
(i.e. all the replicas possess exactly the same content)
before any transaction can be carried out (synchronous
systems) [3, 14].

This work has been partially supported by the Mexican Ministry of Education
(Secretaría de Educación Pública) under contract PROMEP-57, and by the
Spanish MCyT project COSACO.
Weak consistency algorithms [7, 13, 1], by contrast,
generate very little traffic and low latency, and are more
scalable. They do not sacrifice availability or response
time in order to guarantee strong consistency, but only
need to ensure that the replicas eventually converge to a
consistent state within a finite, though not bounded, period
of time. They are very useful in systems where it is not
necessary for all the replicas to be totally consistent
before transactions can be carried out (systems that
tolerate a certain degree of asynchrony). This is the case
of Usenet news, or of computer-supported cooperative work
systems.
With the weak consistency algorithm of [7], each server
(replica) from time to time chooses a neighbour with which
to start an update session. In an update session the two
servers exchange summary vectors and then exchange the
data each is missing, so that at the end of the session
both servers hold the same, mutually consistent content.
These are called anti-entropy sessions, because each
session between replicas reduces the total entropy in the
system. In this paper they will be referred to simply as
“sessions”.
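A session of this kind can be sketched as follows. This is a minimal illustration only; the `Replica` class, its fields, and the summary-vector representation are assumptions made for the sketch, not the implementation described in [7] or in this paper:

```python
# Minimal sketch of an anti-entropy session between two replicas.
# The Replica class and its summary-vector representation are
# illustrative assumptions, not the paper's actual implementation.

class Replica:
    def __init__(self, name):
        self.name = name
        # Summary vector: highest update number seen from each origin replica.
        self.summary = {}
        # Log of updates received so far: (origin, seq) -> payload.
        self.log = {}

    def local_update(self, seq, payload):
        """Record an update originating at this replica."""
        self.summary[self.name] = seq
        self.log[(self.name, seq)] = payload

    def updates_missing_at(self, their_summary):
        """Updates we hold that the peer's summary vector has not yet seen."""
        return {k: v for k, v in self.log.items()
                if k[1] > their_summary.get(k[0], 0)}

    def apply(self, updates):
        """Merge the peer's updates and advance the summary vector."""
        for (origin, seq), payload in updates.items():
            self.log[(origin, seq)] = payload
            self.summary[origin] = max(self.summary.get(origin, 0), seq)


def anti_entropy_session(a, b):
    """Exchange summary vectors, then exchange the missing updates."""
    a.apply(b.updates_missing_at(a.summary))
    b.apply(a.updates_missing_at(b.summary))


r1, r2 = Replica("r1"), Replica("r2")
r1.local_update(1, "doc v1")
r2.local_update(1, "other doc")
anti_entropy_session(r1, r2)
assert r1.log == r2.log  # both replicas now hold mutually consistent content
```

After the session both logs and summary vectors are equal, which is the mutual consistency property the session is meant to establish.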
The metric employed is the number of sessions necessary
for a change made at one replica to be propagated to all
the others.
Golding [7] showed that choosing the neighbour server
at random gives the best performance (fewest sessions) for
maintaining the consistency of the replicas in a
peer-to-peer network. As a consequence, however, all the
replicas are updated regardless of their demand (number of
requests per unit of time), so a replica with low demand
may be updated before another with much greater demand.
In most distributed applications, some replicas tend to
have more demand than others due to different factors,
such as:
- Geographical distribution,
- Number of clients, and
- Number of requests arising from more intense activity
among clients.
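The contrast between Golding's uniform choice and a demand-aware alternative can be illustrated by weighting the neighbour choice by each neighbour's request rate. The weighting scheme below is only a hedged sketch of the idea (the function names and the demand figures are invented for illustration), not the algorithm developed in this paper:

```python
import random

def choose_neighbour_uniform(neighbours):
    """Golding-style choice: every neighbour is equally likely."""
    return random.choice(neighbours)

def choose_neighbour_by_demand(neighbours, demand):
    """Demand-biased choice: a neighbour's probability is proportional
    to its demand (requests per unit time).  Illustrative only."""
    weights = [demand[n] for n in neighbours]
    return random.choices(neighbours, weights=weights, k=1)[0]

neighbours = ["r1", "r2", "r3"]
demand = {"r1": 100, "r2": 10, "r3": 1}   # hypothetical request rates
picks = [choose_neighbour_by_demand(neighbours, demand) for _ in range(10000)]
# Under the biased choice, the high-demand replica r1 is selected
# far more often than the low-demand replica r3.
assert picks.count("r1") > picks.count("r3")
```

Under this kind of bias, high-demand replicas tend to enter sessions earlier, which is the intuition behind prioritizing them.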
Proceedings of the 22 nd International Conference on Distributed Computing Systems Workshops (ICDCSW’02)
0-7695-1588-6/02 $17.00 © 2002 IEEE