Celebrating Diversity in Volunteer Computing
David P. Anderson
University of California, Berkeley
davea@ssl.berkeley.edu
Kevin Reed
IBM
knreed@us.ibm.com
Abstract
The computing resources in a volunteer computing
system are highly diverse in terms of software and
hardware type, speed, availability, reliability, network
connectivity, and other properties. Similarly, the jobs
to be performed may vary widely in terms of their
hardware and completion time requirements. To
maximize system performance, the system’s job
selection policy must accommodate both types of
diversity.
In this paper we discuss diversity in the context of
World Community Grid (a large volunteer computing
project sponsored by IBM) and BOINC, the
middleware system on which it is based. We then
discuss the techniques used in the BOINC scheduler to
efficiently match diverse jobs to diverse hosts.
1. Introduction
Volunteer computing is a form of distributed
computing in which the general public volunteers
processing and storage resources to computing
projects. BOINC is a software platform for volunteer
computing [2]. BOINC is being used by projects in
physics, molecular biology, medicine, chemistry,
astronomy, climate dynamics, mathematics, and the
study of games. There are currently 50 projects and
580,000 volunteer computers supplying an average of
1.2 PetaFLOPS.
Compared to other types of high-performance
computing, volunteer computing has a high degree of
diversity. The volunteered computers vary widely in
terms of software and hardware type, speed,
availability, reliability, and network connectivity.
Similarly, the applications and jobs vary widely in
terms of their resource requirements and completion
time constraints.
These sources of diversity place many demands on
BOINC. Foremost among these is the job selection
problem: when a client contacts a BOINC scheduling
server, the server must choose, from a database of
perhaps a million jobs, those that are “best” for that
client according to a complex set of criteria.
Furthermore, the server must handle hundreds of such
requests per second.
In this paper we discuss this problem in the context
of the IBM-sponsored World Community Grid, a large
BOINC-based volunteer computing project. Section 2
describes the BOINC architecture. Section 3
summarizes the population of computers participating
in World Community Grid, and the applications it
supports. In Section 4 we discuss the techniques used
in the BOINC scheduling server to efficiently match
diverse jobs to diverse hosts.
This work was supported by National Science
Foundation award OCI-0721124.
2. The BOINC model and architecture
The BOINC model involves projects and
volunteers. Projects are organizations (typically
academic research groups) that need computing power.
Projects are independent; each operates its own
BOINC server.
Volunteers participate by running BOINC client
software on their computers (hosts). Volunteers can
attach each host to any set of projects, and can specify
the quota of bottleneck resources allocated to each
project.
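As an illustration of per-project quotas, the following minimal sketch assumes that a volunteer expresses each quota as a non-negative relative weight and that a host divides a resource among its attached projects in proportion to those weights. The function name `resource_fractions`, the weight representation, and the project names are hypothetical; the text above states only that a quota can be specified for each project.

```python
from typing import Dict


def resource_fractions(shares: Dict[str, float]) -> Dict[str, float]:
    """Convert per-project relative weights into fractions that sum to 1.

    Assumption: quotas are relative weights (hypothetical representation).
    """
    if any(weight < 0 for weight in shares.values()):
        raise ValueError("weights must be non-negative")
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive weight")
    # Each project receives its weight divided by the sum of all weights.
    return {project: weight / total for project, weight in shares.items()}


# Hypothetical host attached to two projects with weights 100 and 300.
fractions = resource_fractions({"project_a": 100, "project_b": 300})
```

Under these assumptions, `project_a` would receive one quarter of the resource and `project_b` three quarters.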
A BOINC server is centered around a relational
database, whose tables correspond to the abstractions
of BOINC’s computing model:
• Platform: an execution environment, typically the
combination of an operating system and processor
type (Windows/x86), or a virtual environment
(VMWare/x86 or Java).
• Application: the abstraction of a program,
independent of platforms or versions.
• Application version: an executable program.
Each is associated with an application, a platform,
a version number, and one or more files (main
program, libraries, and data).
• Job: a computation to be done. Each is associated
with an application (not an application version or
platform) and a set of input files.
• Job instance: the execution of a job on a
particular host. Each job instance is associated
with an application version (chosen when the job
is issued) and a set of output files.
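The relationships among these abstractions can be sketched as plain data classes. This is an illustrative model, not the actual BOINC database schema: all class, field, and function names are hypothetical, and the rule used in `issue_job` (pick the highest-numbered application version for the host's platform) is an assumption chosen to show that the version is bound only when the job is issued.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Platform:
    name: str  # e.g. "Windows/x86"


@dataclass(frozen=True)
class Application:
    name: str  # independent of platforms or versions


@dataclass
class AppVersion:
    app: Application
    platform: Platform
    version_num: int
    files: List[str]  # main program, libraries, and data


@dataclass
class Job:
    app: Application  # an application, not a version or platform
    input_files: List[str]


@dataclass
class JobInstance:
    job: Job
    app_version: AppVersion  # chosen when the job is issued
    host_id: int
    output_files: List[str] = field(default_factory=list)


def issue_job(job: Job, host_id: int, host_platform: Platform,
              versions: List[AppVersion]) -> Optional[JobInstance]:
    """Bind a job to an application version for one host.

    Assumption: the highest version number for the host's platform wins.
    Returns None when no version of the application fits the platform.
    """
    candidates = [v for v in versions
                  if v.app == job.app and v.platform == host_platform]
    if not candidates:
        return None
    best = max(candidates, key=lambda v: v.version_num)
    return JobInstance(job=job, app_version=best, host_id=host_id)
```

Because a job refers only to an application, the same job could be issued to hosts on different platforms, each instance using a different application version.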
Proceedings of the 42nd Hawaii International Conference on System Sciences - 2009
978-0-7695-3450-3/09 $25.00 © 2009 IEEE