Adapting Data-Intensive Workloads to Generic
Allocation Policies in Cloud Infrastructures
Ioannis Kitsos*, Antonis Papaioannou*, Nikos Tsikoudis*, and Kostas Magoutis
Institute of Computer Science (ICS)
Foundation for Research and Technology Hellas (FORTH)
Heraklion GR-70013, Greece
{kitsos,papaioan,tsikudis,magoutis}@ics.forth.gr
Abstract—Resource allocation policies in public Clouds are today
largely agnostic to requirements that distributed applications
have from their underlying infrastructure. As a result,
assumptions about data-center topology that are built-into
distributed data-intensive applications are often violated,
impacting performance and availability goals. In this paper we
describe a management system that discovers a limited amount of
information about Cloud allocation decisions (in particular, VMs
of the same user that are collocated on a physical machine) so
that data-intensive applications can adapt to those decisions and
achieve their goals. Our distributed discovery process is based on
either application-level techniques (measurements) or a novel
lightweight and privacy-preserving Cloud management API
proposed in this paper. Using the distributed Hadoop file system
as a case study we show that VM collocation in a Cloud setup
occurs in commercial platforms and that our methodologies can
handle its impact in an effective, practical, and scalable manner.
Keywords: Cloud management; distributed data-intensive
applications.
I. INTRODUCTION
Cloud computing [3] has become a dominant model of
infrastructural services, replacing or complementing traditional
data centers and providing distributed applications with the
resources that they need to serve Internet-scale workloads. The
Cloud computing model is centered on the use of virtualization
technologies (virtual machines (VMs), networks, and disks) to
reap the benefits of statistical multiplexing on a shared
infrastructure. However, these technologies also obscure the
details of the infrastructure from applications. While this is not
an issue for some applications, others require knowledge of the
details of the infrastructure for reasons of performance and
availability.
Several distributed data-intensive applications [6][7][10]
replicate data onto different system nodes for availability. To
truly achieve availability though, nodes holding replicas of the
same data block must have independent failure modes. To
decide which nodes are thus suitable to hold a given set of
replicas, an application must have knowledge about data center
topology, which is often hidden from applications or provided
only in coarse form.
* The authors are listed in alphabetical order.
978-1-4673-0269-2/12/$31.00 ©2012 IEEE
Data center applications often try to
reverse-engineer data center structure by assuming a simple
model of infrastructure (e.g., racks of servers interconnected
via network switches) and attempt to derive deployment details
(i.e., whether two VMs are placed on the same physical
machine, same rack, or across racks) via network addressing
information. Most current Cloud-infrastructure services,
however, completely virtualize network resources (i.e., there is
no discernible mapping between virtual network assignments
and physical network structure), thus effectively hiding their
resource allocation decisions from applications.
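To illustrate the reverse-engineering heuristic described above, the following sketch infers a "rack" identifier from an IP prefix and spreads replicas across distinct inferred racks, in the style of Hadoop's rack-awareness topology scripts. The one-subnet-per-rack (/24) assumption and all function names are ours, for illustration only; they are not part of any Cloud API:

```python
import ipaddress

def infer_rack(ip: str, prefix_len: int = 24) -> str:
    """Guess a rack identifier from an IP address, assuming (as some
    physical deployments do) that each rack owns one /24 subnet."""
    net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return f"/rack-{net.network_address}"

def pick_replica_nodes(nodes, num_replicas=3):
    """Choose replica holders on distinct inferred racks, reusing
    racks only when too few distinct ones are visible."""
    chosen, racks_used = [], set()
    for ip in nodes:
        rack = infer_rack(ip)
        if rack not in racks_used:
            chosen.append(ip)
            racks_used.add(rack)
        if len(chosen) == num_replicas:
            return chosen
    # Fewer distinct racks than replicas: fall back to reusing racks.
    for ip in nodes:
        if ip not in chosen:
            chosen.append(ip)
        if len(chosen) == num_replicas:
            break
    return chosen

nodes = ["10.0.1.5", "10.0.1.9", "10.0.2.7", "10.0.3.4"]
print(pick_replica_nodes(nodes))  # ['10.0.1.5', '10.0.2.7', '10.0.3.4']
```

Under full network virtualization the /24 heuristic silently breaks: two "racks" inferred this way may share a physical machine, which is exactly the failure-mode assumption this paper revisits.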
Cloud management systems today use generic allocation
algorithms such as round-robin across servers or round-robin
across racks with the intent to spread a user’s VM allocation as
much as possible across the infrastructure. Two VMs however
can still end up being collocated because either the allocation
policy may in some cases favor this (e.g., when using large
core-count systems such as Intel’s Single-chip Cloud Computer (SCC)
[9]) or because of limited choices available to the allocation
algorithm at different times. While collocation may be
desirable for the high data throughput available between the
VMs, it is in general undesirable when an application wants to
decouple VMs with respect to physical machine or network
failure for the purpose of achieving high availability.
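A toy simulation makes the collocation argument concrete. The model below (hypothetical; real Cloud schedulers are far richer) places VMs round-robin across hosts with bounded capacity; by pigeonhole, a user requesting more VMs than there are hosts ends up with collocated pairs:

```python
from collections import defaultdict
from itertools import cycle

def round_robin_allocate(hosts, capacity, requests):
    """Place each VM on the next host in round-robin order that still
    has a free slot (a toy model of a generic Cloud policy)."""
    load = defaultdict(int)
    placement = []
    rr = cycle(hosts)
    for vm in requests:
        for _ in range(len(hosts)):
            h = next(rr)
            if load[h] < capacity:
                load[h] += 1
                placement.append((vm, h))
                break
    return placement

# Three hosts, two slots each; one user requests four VMs, so the
# policy has no choice but to collocate two of them.
hosts = ["h1", "h2", "h3"]
placement = round_robin_allocate(hosts, capacity=2,
                                 requests=["vm1", "vm2", "vm3", "vm4"])

by_host = defaultdict(list)
for vm, h in placement:
    by_host[h].append(vm)
collocated = [vms for vms in by_host.values() if len(vms) > 1]
print(collocated)  # [['vm1', 'vm4']]
```

The same effect occurs in practice when a generic policy's choices are constrained by other tenants' allocations, even if the requesting user holds far fewer VMs than there are hosts.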
One option to provide applications with awareness of Cloud
resource-allocation policies is to place them inside the Cloud
and offer them as (what is popularly called) a platform-as-a-
service (or PaaS). This solution requires a close relationship
with the Cloud infrastructure provider and is thus not an option
for the average application developer. Another option is to
extend Cloud resource-allocation policies with application
awareness [12]. This solution ensures that application
requirements will be taken into account to some extent when
allocating resources. Deploying it, however, requires significant
changes to current Cloud management systems. Finally, another approach
proposed in this paper is to infer or be explicitly provided with
a limited amount of information from the underlying Cloud.
Adaptation mechanisms often built into distributed data-
intensive applications (such as data migration) can leverage this
information to adapt to a Cloud’s generic allocation policies.
We believe that this approach is simpler and thus easier to
deploy compared to the other alternatives.
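The adaptation step can be sketched as follows, assuming a discovery mechanism (of either kind) has already produced a set of collocated VM pairs. The data structures and function names here are ours, not the system's actual interface; data migration stands in for whatever adaptation mechanism the application provides:

```python
def violating_blocks(replica_map, collocated_pairs):
    """Return blocks whose replica set contains a collocated VM pair,
    i.e. blocks whose replicas no longer fail independently."""
    colo = {frozenset(p) for p in collocated_pairs}
    bad = []
    for block, replicas in replica_map.items():
        vms = list(replicas)
        if any(frozenset((a, b)) in colo
               for i, a in enumerate(vms) for b in vms[i + 1:]):
            bad.append(block)
    return bad

def plan_migrations(replica_map, collocated_pairs, spare_vms):
    """For each violating block, move one replica of a collocated
    pair to a spare VM (the data-migration adaptation)."""
    plan = {}
    spares = list(spare_vms)
    colo = {frozenset(p) for p in collocated_pairs}
    for block in violating_blocks(replica_map, collocated_pairs):
        vms = replica_map[block]
        for i, a in enumerate(vms):
            for b in vms[i + 1:]:
                if frozenset((a, b)) in colo and spares:
                    # Move b off the shared physical machine.
                    plan[block] = (b, spares.pop(0))
                    break
            if block in plan:
                break
    return plan

replicas = {"blk-1": ["vmA", "vmB", "vmC"],
            "blk-2": ["vmA", "vmD", "vmE"]}
print(plan_migrations(replicas, [("vmA", "vmB")], spare_vms=["vmF"]))
# {'blk-1': ('vmB', 'vmF')}
```

Note that the application only needs to know *which* of its own VMs share a machine, not where they are in the data center, which is what keeps the discovery interface lightweight and privacy-preserving.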
The two approaches we explore in this paper are: (a) the
black-box approach, namely to discover VM collocation via the