Cluster Comput (2009) 12: 123–140
DOI 10.1007/s10586-009-0076-0
On GPU’s viability as a middleware accelerator
Samer Al-Kiswany · Abdullah Gharaibeh ·
Elizeu Santos-Neto · Matei Ripeanu
Received: 1 January 2009 / Accepted: 5 January 2009 / Published online: 17 January 2009
© Springer Science+Business Media, LLC 2009
Abstract Today Graphics Processing Units (GPUs) are a
largely underexploited resource on existing desktops and
a possible cost-effective enhancement to high-performance
systems. To date, most applications that exploit GPUs are
specialized scientific applications. Little attention has been
paid to harnessing these highly-parallel devices to support
more generic functionality at the operating system or mid-
dleware level. This study starts from the hypothesis that
generic middleware-level techniques that improve distrib-
uted system reliability or performance (such as content ad-
dressing, erasure coding, or data similarity detection) can be
significantly accelerated using GPU support.
We take a first step towards validating this hypothesis
by designing StoreGPU, a library that accelerates a number
of hashing-based middleware primitives popular in distributed
storage system implementations. Our evaluation shows that
StoreGPU enables up to twenty-five-fold performance gains on
synthetic benchmarks as well as on a high-level application:
the online similarity detection between large data files.
Keywords Middleware · Storage system · Graphics
Processing Unit · GPU hashing · StoreGPU
S. Al-Kiswany (✉) · A. Gharaibeh · E. Santos-Neto · M. Ripeanu
Electrical and Computer Engineering Department, The University
of British Columbia, Vancouver, BC Canada, V6T 1Z4
e-mail: samera@ece.ubc.ca
A. Gharaibeh
e-mail: abdullah@ece.ubc.ca
E. Santos-Neto
e-mail: elizeus@ece.ubc.ca
M. Ripeanu
e-mail: matei@ece.ubc.ca
1 Introduction
Recent advances in processor technology [1] have re-
sulted in a wide availability of massively parallel Graphics
Processing Units (GPUs). Low-end GPUs, like NVIDIA’s
GeForce 8600 priced at about $100, have 32 processors and
256 MB of memory, while high-end GPUs, like the NVIDIA
8800 GTX priced at about $300, have up to 128 processors
running at 575 MHz and 768 MB of memory.
With these characteristics, GPUs are often underutilized in
desktop deployments (as these are generally provisioned
for graphics-intensive workloads such as high-definition
video) and may be cost-effective enhancements to high-end
server systems.
However, the constraints introduced by the GPU pro-
gramming model which, until recently, specialized in sup-
porting only graphical processing, have led past efforts
aimed at harnessing this resource to focus exclusively on
computationally intensive scientific applications [2]. Al-
though these efforts confirmed that significant speedup is
achievable, the development cost for this specialized plat-
form was often prohibitive. Recently, however, the in-
troduction of general-purpose programming models (e.g.,
NVIDIA’s CUDA [3]) has lowered the development cost,
making GPUs attractive to a broader spectrum of applications.
Additionally, improvements in GPU architecture have created
the opportunity for data-intensive applications to benefit
from GPUs.
This study starts from the observation that a number of
techniques that enhance the reliability and/or performance
of distributed storage systems (e.g., content addressability
in data storage [4, 5], erasure codes [6], on-the-fly data sim-
ilarity detection [7]) incur computational overheads that of-
ten preclude their effective usage with today’s commodity