Equinox: Adaptive Network Reservation in the Cloud

Praveen Kumar*, Garvit Choudhary†, Dhruv Sharma*, Vijay Mann*
*IBM Research, India {praveek7, dhsharm4, vijamann}@in.ibm.com
†IIT Roorkee, India {garv7uec}@iitr.ac.in

Abstract-Most of today's public cloud services provide dedicated compute and memory resources, but they do not provide any dedicated network resources. The shared network can be a major cause of the well-known "noisy neighbor" problem, which is a growing concern in public cloud services like Amazon EC2. Network reservations, therefore, are of prime importance for the Cloud. However, a tenant's network demand usually keeps changing over time, and thus a static one-time reservation would lead either to poor performance or to resource wastage (and higher cost). In this context, we present Equinox - a system that automatically reserves end-to-end bandwidth for a tenant based on the predicted demand and adapts this reservation with time. We leverage flow monitoring support in virtual switches to collect flow data that helps us predict demand at a future time. We use a combination of vswitch based rate-limiting and OpenFlow based flow rerouting to provision end-to-end bandwidth requirements. We have implemented Equinox in an OpenStack environment with OpenFlow based network control. Our experimental results, using traces based on Facebook's production data centers, show that Equinox can provide up to 47% reduction in bandwidth cost as compared to a static reservation scheme while providing the same efficiency in terms of flow completion times.

I. INTRODUCTION

Sharing of resources such as CPU, memory and network is the foundation of any public cloud service. In such a scenario, security and performance become prime concerns for a cloud tenant. Given that today's data center networks [1] are highly oversubscribed, network guarantees gain prime importance when the network is shared among multiple tenants.
Recent works [2], [3], [1] highlight some of the key requirements for high performance in a shared network. According to [4], the top requirements include a minimum bandwidth guarantee for a VM under worst-case traffic, simplicity for the tenant in terms of inputs, high utilization of the network resources, and scalability.

Most of today's public cloud solutions (like Amazon EC2 [5] and Microsoft Azure [6]) provide dedicated compute and memory resources (EC2 offers dedicated instances), but they do not provide any dedicated network resources. At best, EC2 provides cluster networking, which ensures that all instances of a tenant are placed in a high-speed, low-latency network. Noisy neighbor [7] is a growing concern in commercial cloud systems like EC2. Dedicated server resources ensure that there are no noisy neighbors sharing the same server hardware that can affect each other's performance. However, the network is still shared by different tenants' applications, and in the absence of bandwidth reservations, a bandwidth-hungry application belonging to one tenant may affect the performance of an application belonging to another tenant.

978-1-4799-3635-9/14/$31.00 ©2014 IEEE

Given the shared nature of the data center network, it becomes important for a tenant to have some network guarantee in order to estimate the performance of its applications. Static reservations have been proposed before in the literature [8], [2]. However, they have not been implemented in real cloud systems like EC2 for three main reasons:

1) End-to-end reservations are hard to enforce in a multi-tenant cloud - one needs to enforce the reservation at the host NIC level and then across the entire path that different flows originating from a tenant's VMs may take. Traditional networking equipment does not provide this type of fine-grained flow-level reservation.
2) Making a static reservation may be non-trivial for tenants.
It is rare that tenants have a precise idea of how much bandwidth their applications might need. If they oversubscribe, they will end up paying more, and if they under-subscribe, their application will face performance problems.
3) Offering static reservations may be equally troublesome for cloud providers - how many bandwidth slabs should be offered to clients? Should tenants be charged at a flat rate for each slab, or should it be "pay for what you use" (the general cloud philosophy)?

Furthermore, recent studies such as [9], [10], [11], [12] show that the bandwidth demands of applications vary significantly over time. Therefore, a solution based on static network reservations may result in under-provisioning during peak periods, leading to poor performance for the tenant, while during non-peak periods it may result in over-provisioning, leading to wastage of resources [13]. There is a need for a network reservation mechanism that dynamically adjusts the bandwidth reservations based on the demand.

To overcome these problems, we present Equinox - a system for adaptive end-to-end network reservations for tenants in a cloud environment. Equinox implements adaptive end-to-end network reservations in an OpenStack [14] cloud provisioning system using network control provided by an OpenFlow controller. Equinox leverages end-host Open vSwitch [15] based flow monitoring to estimate network bandwidth demands for each VM and automatically provisions that bandwidth end-to-end. It implements reservations at the host level through vswitch based rate limits enforced through OpenStack. To ensure that flows get the required network bandwidth, Equinox routes flows through network paths with sufficient network bandwidth. We also describe practical ways in which Equinox