Equinox: Adaptive Network Reservation in the Cloud
Praveen Kumar*, Garvit Choudhary†, Dhruv Sharma*, Vijay Mann*
*IBM Research, India
{praveek7, dhsharm4, vijamann}@in.ibm.com
†IIT Roorkee, India
{garv7uec}@iitr.ac.in
Abstract-Most of today's public cloud services provide dedicated
compute and memory resources but they do not provide
any dedicated network resources. The shared network can be
a major cause of the well known "noisy neighbor" problem,
which is a growing concern in public cloud services like Amazon
EC2. Network reservations, therefore, are of prime importance
for the Cloud. However, a tenant's network demand would usually
keep changing over time and thus, a static one-time reservation
would either lead to poor performance or resource wastage (and
higher cost). In this context, we present Equinox - a system that
automatically reserves end-to-end bandwidth for a tenant based
on the predicted demand and adapts this reservation with time.
We leverage flow monitoring support in virtual switches to collect
flow data that helps us predict demand at a future time. We use a
combination of vswitch based rate-limiting and OpenFlow based
flow rerouting to provision end-to-end bandwidth requirements.
We have implemented Equinox in an OpenStack environment
with OpenFlow based network control. Our experimental results,
using traces based on Facebook's production data centers, show
that Equinox can provide up to 47% reduction in bandwidth cost
as compared to a static reservation scheme while providing the
same efficiency in terms of flow completion times.
I. INTRODUCTION
Sharing of resources such as CPU, memory and network is
the foundation of any public cloud service. In such a scenario,
security and performance become a prime concern for a cloud
tenant. Given that today's data center networks [1] are highly
oversubscribed, network guarantees gain prime importance
when the network is shared among multiple tenants. Recent works
[2], [3], [1] highlight some of the key requirements for high
performance in a shared network. According to [4], the top
requirements include a minimum bandwidth guarantee for a
VM under worst-case traffic, simplicity for the tenant in
terms of inputs, high utilization of the network resources and
scalability.
Most of today's public cloud solutions (like Amazon EC2
[5] and Microsoft Azure [6]) provide dedicated compute and
memory resources (EC2 offers dedicated instances), but they
do not provide any dedicated network resources. At best, EC2
provides cluster networking, which ensures that all instances of
a tenant are placed in a network that offers high speed and low
latency. The noisy neighbor problem [7] is a growing concern in commercial
cloud systems like EC2. Dedicated server resources ensure
that there are no noisy neighbors sharing the same server
hardware that can affect each other's performance. However,
the network is still shared by different tenants' applications and
in the absence of bandwidth reservations, a bandwidth hungry
application belonging to a tenant may affect the performance
of an application belonging to another tenant.
978-1-4799-3635-9/14/$31.00 ©2014 IEEE
Given the shared nature of the data center network, it be-
comes important for a tenant to have some network guarantee
to estimate the performance of the applications. Static reserva-
tions have been proposed before in literature [8], [2]. However,
they have not been implemented in real cloud systems like EC2
for three main reasons:
1) End-to-end reservations are hard to enforce in a multi-
tenant cloud - one needs to enforce the reservation
at the host NIC level and then across the entire path
that different flows originating from a tenant's VMs
may take. Traditional networking equipment does not
provide this type of fine-grained flow-level reservation.
2) Making a static reservation may be non-trivial for
tenants. It is rare that tenants have a precise idea about
how much bandwidth their applications might need.
If they oversubscribe, they will end up paying more,
and if they under-subscribe their application will face
performance problems.
3) Offering static reservation may be equally troublesome
for cloud providers - how many bandwidth slabs should
be offered to clients? Should tenants be charged at a flat
rate for each slab or should it be "pay for what you use"
(the general cloud philosophy)?
Furthermore, recent studies such as [9], [10], [11], [12]
show that there is significant variation with time in bandwidth
demands of applications. Therefore, a solution based on static
network reservations may result in under-provisioning for peak
periods leading to poor performance for the tenant while for
non-peak periods, it may result in over-provisioning leading
to wastage of resources [13]. There is a need for a network
reservation mechanism that dynamically adjusts the bandwidth
reservations based on the demand.
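
The trade-off described above can be illustrated with a small sketch. The demand trace, the 20% headroom and the last-value predictor below are all hypothetical choices made for illustration; they are not the traces or the prediction method used by Equinox. The sketch compares a static reservation at the peak, a static reservation at the mean, and a naive adaptive reservation, reporting the total bandwidth reserved (a proxy for cost) and the total demand that exceeded the reservation (a proxy for poor performance).

```python
from typing import List, Tuple


def evaluate(demand: List[float], reserved: List[float]) -> Tuple[float, float]:
    """Return (total bandwidth reserved, total unmet demand) over the trace."""
    total_reserved = sum(reserved)
    unmet = sum(max(d - r, 0.0) for d, r in zip(demand, reserved))
    return total_reserved, unmet


def static_plan(demand: List[float], level: float) -> List[float]:
    """Reserve the same bandwidth in every interval."""
    return [level] * len(demand)


def adaptive_plan(demand: List[float], headroom: float = 0.2) -> List[float]:
    """Reserve the previous interval's demand plus some headroom.

    Assumption: a naive last-value predictor, used only to illustrate
    adaptation; the first interval's demand is assumed to be known.
    """
    plan = [demand[0] * (1 + headroom)]
    for previous in demand[:-1]:
        plan.append(previous * (1 + headroom))
    return plan


# Hypothetical per-interval demand in Mbps with a two-interval peak.
demand = [100.0, 100.0, 400.0, 400.0, 100.0, 100.0]

peak = evaluate(demand, static_plan(demand, max(demand)))
mean = evaluate(demand, static_plan(demand, sum(demand) / len(demand)))
adaptive = evaluate(demand, adaptive_plan(demand))

for name, (reserved, unmet) in [("static-peak", peak),
                                ("static-mean", mean),
                                ("adaptive", adaptive)]:
    print(f"{name:12s} reserved={reserved:7.1f}  unmet={unmet:6.1f}")
```

On this toy trace, the peak reservation never falls short but reserves the most bandwidth, the mean reservation is cheapest but misses the peak, and the adaptive plan sits in between, falling short only in the interval where demand first jumps. A better predictor would shrink that gap.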
To overcome these problems, we present Equinox - a system
for adaptive end-to-end network reservations for tenants in a
cloud environment. Equinox implements adaptive end-to-end
network reservations in an OpenStack [14] cloud provisioning
system using network control provided by an OpenFlow con-
troller. Equinox leverages end-host Open vSwitch [15] based
flow monitoring to estimate network bandwidth demands for
each VM and automatically provisions that bandwidth end-
to-end. It implements reservations at the host level through
vswitch based rate limits enforced through OpenStack. To
ensure that flows get the required network bandwidth, Equinox
routes flows through network paths with sufficient network
bandwidth. We also describe practical ways in which Equinox