1939-1374 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TSC.2014.2312912, IEEE Transactions on Services Computing

Network Aware Scheduling for Virtual Machine Workloads with Interference Models

Sam Verboven, Kurt Vanmechelen and Jan Broeckhove

Abstract—Modern data centers use virtualization as a means to increase utilization of increasingly powerful multi-core servers. Applications often require only a fraction of the resources provided by modern hardware. Multiple concurrent workloads are therefore required to achieve adequate utilization levels. Current virtualization solutions allow hardware to be partitioned into Virtual Machines with appropriate isolation on most levels. However, unmanaged consolidation of resource intensive workloads can still lead to unexpected performance variance. Measures are required to avoid or reduce performance interference and provide predictable service levels for all applications. In this paper, we identify and reduce network-related interference effects using performance models based on the runtime characteristics of virtualized workloads. We increase the applicability of existing training data by adding network-related performance metrics and benchmarks. Using the extended set of training data, we predict performance degradation with existing modeling techniques as well as combinations thereof. Application clustering is used to identify several new network-related application types with clearly defined performance profiles. Finally, we validate the added value of the improved models by introducing new scheduling techniques and comparing them to previous efforts.
We demonstrate how the inclusion of network-related parameters in performance models can significantly increase the performance of consolidated workloads.

Index Terms—Virtualization, Xen, Profiling, Performance Modeling, Support Vector Machines, Scheduling

• S. Verboven, K. Vanmechelen and J. Broeckhove are associated with the Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Middelheimlaan 1. E-mail: sam.verboven@uantwerpen.ac.be

1 INTRODUCTION

Virtualization has become a widespread technology used to abstract, combine or divide computing resources in order to allow resource requests to be described and fulfilled with minimal dependence on the underlying physical hardware. Using virtualization, an application and its execution environment can be managed as a single entity, a virtual machine (VM) [1], whose configuration can be captured in a single file, a virtual machine image. These virtual machine images can be deployed in a hardware-agnostic manner on any hardware that hosts a compatible hypervisor or Virtual Machine Monitor. The hypervisor is a software component that hosts virtual machines, also referred to as guests. This software layer abstracts the physical resources from the virtual machines by providing a virtual processor and other virtualized versions of system devices such as I/O devices, storage, memory, etc. [2], [3]. Hypervisors thereby offer flexibility in partitioning the underlying hardware and ensure some degree of isolation between the different virtual machines sharing these resources.

Although the hypervisor provides adequate isolation on many levels (e.g. security, faults, ...), performance interference can still be an issue, particularly with resource intensive workloads [4]. Each virtual machine is allocated a subset of the available resources and requires the hypervisor's cooperation to complete certain tasks (e.g. disk or network I/O).
When multiple VMs share the same hardware, bottlenecks can occur at both the hardware and the hypervisor level. Resource contention problems will likely become even more important in the future as more VMs share the same hardware in an effort to increase utilization. The evolution towards an ever-increasing number of CPU cores per server and the relatively slower gains in I/O performance further highlight the need to proactively address performance interference.

Improvements in VM scheduling and the reduction of virtualization overhead can partially mitigate these issues. In the context of data centers, where VMs can be easily migrated between hosts, the problem can be addressed at a higher level through intelligent scheduling. This approach requires more insight into the resource consumption of individual workloads and their impact on other workloads. By identifying potential sources of performance interference, more informed scheduling decisions can be made that reduce the impact of resource contention, thereby allowing more VMs to be collocated on the same hardware while providing more predictable performance.

In previous work [5], we improved application performance using interference-aware scheduling techniques based on slowdown prediction models and application classification. We demonstrated that these models can be used to greatly reduce interference effects and improve overall performance. However, the profiled system-level characteristics were limited to CPU, cache and disk metrics. In this paper, we extend the proposed approach to identify and reduce performance interference using network-oriented applications.
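
As an illustration only, the general idea of interference-aware placement described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the scheduler or the models evaluated in this paper: the workload classes ("cpu", "disk", "net"), the slowdown factors in PREDICTED_SLOWDOWN, and the functions pair_slowdown, host_cost and schedule are all hypothetical. In practice the slowdown values would come from a trained prediction model rather than a fixed table.

```python
from itertools import combinations

# Hypothetical predicted slowdown factors for two co-located workload classes
# (1.0 means no interference). Illustrative values only, not measured data.
PREDICTED_SLOWDOWN = {
    frozenset(["cpu"]): 1.10,          # two CPU-bound VMs
    frozenset(["cpu", "disk"]): 1.05,
    frozenset(["cpu", "net"]): 1.05,
    frozenset(["disk"]): 1.60,         # two disk-bound VMs
    frozenset(["disk", "net"]): 1.30,
    frozenset(["net"]): 1.80,          # two network-bound VMs
}


def pair_slowdown(class_a, class_b):
    """Look up the assumed slowdown when two workload classes share a host."""
    return PREDICTED_SLOWDOWN[frozenset([class_a, class_b])]


def host_cost(classes):
    """Sum the predicted excess slowdown over all VM pairs on one host."""
    return sum(pair_slowdown(a, b) - 1.0 for a, b in combinations(classes, 2))


def schedule(vms, n_hosts, slots_per_host):
    """Greedily place each VM on the host where it adds the least
    predicted interference, respecting a per-host slot limit."""
    hosts = [[] for _ in range(n_hosts)]
    for name, cls in vms:
        candidates = [h for h in hosts if len(h) < slots_per_host]
        if not candidates:
            raise ValueError("not enough capacity for all VMs")
        best = min(
            candidates,
            key=lambda h: sum(pair_slowdown(cls, other) - 1.0 for _, other in h),
        )
        best.append((name, cls))
    return hosts


if __name__ == "__main__":
    vms = [("vm1", "net"), ("vm2", "net"), ("vm3", "cpu"), ("vm4", "cpu")]
    for index, host in enumerate(schedule(vms, n_hosts=2, slots_per_host=2)):
        print(index, host, round(host_cost([c for _, c in host]), 2))
```

With these assumed values, the two network-bound VMs end up on different hosts, each paired with a CPU-bound VM, which yields a lower total predicted slowdown than packing VMs of the same class together. A greedy heuristic is used here only because it is short; it does not guarantee an optimal placement.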