J. Cent. South Univ. (2014) 21: 3864−3872
DOI: 10.1007/s11771-014-2373-x
Task scheduling scheme by checkpoint sharing and
task duplication in P2P-based desktop grids
Joon-Min Gil, Young-Sik Jeong
1. School of IT Engineering, Catholic University of Daegu, 13-13, Hayang-ro,
Hayang-eup, Gyeongsan-si, Gyeongbuk 712-701, Korea;
2. Department of Multimedia Engineering, Dongguk University, 30 Pildong-ro 1-gil,
Jung-gu, Seoul 100-715, Korea
© Central South University Press and Springer-Verlag Berlin Heidelberg 2014
Abstract: A scheduling scheme is proposed to reduce execution time by means of both checkpoint sharing and task duplication
under a peer-to-peer (P2P) architecture. In the scheme, the checkpoint executed by each peer (i.e., a resource) is used as an
intermediate result and executed in other peers via its duplication and transmission. The closer the checkpoint is to the final result, the
greater the reduction of execution time for each task, which in turn reduces turnaround time. To evaluate the performance of our
scheduling scheme in terms of transmission cost and execution time, an analytical model with an embedded Markov chain is
presented. We also conduct simulations with a failure rate of tasks and compare the performance of our scheduling scheme with that
of the existing scheme based on client-server architecture. Performance results show that our scheduling scheme is superior to the
existing scheme with respect to the reduction of execution time and turnaround time.
Key words: P2P-based desktop grids; checkpoint sharing; task duplication; embedded Markov chain
1 Introduction
Desktop grids are a practical computing
paradigm that can process massive computational tasks
in various application areas, using the idle cycles of the
heterogeneous resources (generally desktop computers)
connected over the Internet and owned by different
individual users. They are generally suitable for
large-scale applications composed of hundreds of
thousands of small tasks that run the same
computational code. It is well known that desktop grids
make it possible to obtain large-scale computing power
at low cost [1−2]. Since the success of SETI@Home
[3−4], a variety of desktop grid platforms, such as
BOINC [5−6], XtremWeb [7], Korea@Home [8],
SZTAKI [9], and QADPZ [10], have been developed.
Commercial desktop grid systems, such as Entropia [11]
and United Devices [12], have been released for enterprise
computing, and some practical applications of desktop
grids are reported in Refs. [13−14].
An important aspect of desktop grids is that each
resource exhibits volatility, since it can freely withdraw
from participation even in the middle of task
execution. Moreover, each resource exhibits heterogeneity,
as it has its own computing
environment (e.g., CPU performance, memory capacity,
and network speed) [15]. One critical issue in a desktop
grid environment is to minimize the execution time of all
tasks, even though these two properties affect overall
performance adversely [1]. Unexpected failures are a
degrading factor in the minimization of execution time;
they can be partially addressed by a checkpointing
mechanism at the application level [16−17]. Another
method of minimizing the execution time is to share all
of the checkpoints taken on each resource [18].
Checkpoint sharing is a method of reusing, on another
resource, the checkpoint that was most recently taken on
a local desktop (i.e., the intermediate result of a task is
transmitted to other resources so that task execution can
be restarted from the last checkpoint position).
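As a rough illustration of the checkpoint-sharing idea described above, the following Python sketch models a task as a fixed number of work units and lets a second peer resume from the last checkpoint of a peer that withdrew. The names Checkpoint and execute, the unit-based work model, and the fixed checkpoint interval are assumptions made only for this illustration; they are not taken from the proposed scheme.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Checkpoint:
    """Intermediate result of a task (hypothetical structure)."""
    task_id: int
    position: int  # work units completed when the checkpoint was taken


def execute(
    task_id: int,
    total_units: int,
    interval: int,
    fail_at: Optional[int] = None,
    resume_from: Optional[Checkpoint] = None,
) -> Tuple[int, Optional[Checkpoint], bool]:
    """Run a task on one peer.

    Returns (units executed by this peer, last checkpoint, finished flag).
    fail_at is the number of units after which this peer withdraws
    (None means the peer never fails). resume_from is a checkpoint
    received from another peer, if any.
    """
    start = resume_from.position if resume_from is not None else 0
    last = resume_from
    done = start
    while done < total_units:
        if fail_at is not None and done - start >= fail_at:
            return done - start, last, False  # peer withdrew mid-task
        done += 1
        if done % interval == 0:
            last = Checkpoint(task_id, done)  # take a checkpoint
    return done - start, last, True


# Peer A withdraws after 47 of 100 units; its last checkpoint is at unit 40.
units_a, shared, finished_a = execute(1, 100, 10, fail_at=47)

# Peer B receives the shared checkpoint and executes only the remaining 60 units.
units_b, _, finished_b = execute(1, 100, 10, resume_from=shared)

# Without sharing, peer B would have to execute all 100 units from the start.
units_scratch, _, _ = execute(1, 100, 10)
```

In this toy model the saving for the second peer equals the position of the shared checkpoint, which matches the intuition that a checkpoint closer to the final result yields a larger reduction in execution time. Transmission cost, which the analytical model accounts for, is deliberately left out of the sketch.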
Consequently, the purpose of checkpoint sharing is
to reduce the execution time of tasks, leading to a
reduction in turnaround time. Most desktop grid systems,
however, use a client-server model as their main
architecture [6, 11, 19]. Although this model is simple in
architecture as well as in the control of resources and
tasks, it concentrates all functions on the central server,
which makes the server a bottleneck.
Moreover, in the client-server model, checkpoint sharing
is based on storing checkpoints in a central stable
Received date: 2013−11−20; Accepted date: 2014−01−16
Corresponding author: Young-Sik Jeong; E-mail: ysjeong@dongguk.edu