Autonomic Resource Management for Virtualized Database Hosting Systems

Lixi Wang*, Jing Xu†, Ming Zhao*, Yicheng Tu‡, Jose Fortes†
*Florida International University, {lwang007, mzhao}@fiu.edu
†University of Florida, {jxu, fortes}@ufl.edu
‡University of South Florida, ytu@cse.usf.edu

Abstract— The hosting of databases on virtual machines (VMs) has great potential to improve the efficiency of resource utilization and the ease of deployment of database systems. This paper considers the problem of allocation of physical resources on demand to a database’s VM according to QoS (Quality of Service) requirements. This is a challenging problem because of the highly dynamic and complex nature of database systems and their workloads. An autonomic resource management approach is proposed to address this problem based on adaptive fuzzy modeling and prediction techniques. The approach can effectively capture the relationship between a dynamically changing database workload, which is both CPU and I/O intensive, and its VM’s consumption of resources, including both CPU cycles and disk bandwidth. It can be used to predict the resource needs of a database VM online and to guide the on-demand resource allocation according to the workload demand and desired QoS. A prototype of the proposed resource management system is evaluated using typical database workloads based on TPC-H and RUBiS. The results demonstrate that the proposed approach can efficiently allocate resources for a database VM that is serving CPU and I/O intensive queries while meeting the QoS targets.

I. INTRODUCTION

A system-level virtual machine (VM) (e.g., VMware [1], Xen [2]) can be a powerful platform for deploying and hosting database systems. From the perspective of database users, VMs enable fine-tuned databases to be encapsulated along with their execution environments and conveniently deployed as appliances on different hosting systems.
From the perspective of resource owners, VMs allow flexible resource allocation to meet changing database needs and efficient resource utilization by sharing resources between databases and other applications. However, although many important applications, such as Web and application servers, have been widely deployed on VMs, efficient hosting of databases on virtualized resources remains very challenging because of the highly complex and dynamic nature of database systems and their workloads. Typical databases have to serve dynamically changing workloads consisting of a wide variety of queries, whose executions can consume different types and amounts of resources, including both CPU and I/O. These properties make it difficult to host databases on shared resources without compromising performance or wasting resources.

This paper aims to address the above challenges through an autonomic VM resource management system that can automatically control and optimize the allocations of different types of resources to database VMs based on their workload demands and QoS (Quality of Service) objectives. The goal of the proposed system is two-fold. First, it should automatically learn a database VM’s needs for multiple types of resources when servicing a complex query workload, so that resources can be allocated efficiently to the VM while satisfying the desired query QoS. Second, it should automatically adapt to dynamic changes in a database VM’s resource usage and adjust the VM’s resource allocations in a timely manner to maintain both efficient resource usage and query QoS.

To realize these goals, this paper proposes a fuzzy-modeling-based online learning and prediction approach to the autonomic resource management of virtualized database hosting systems.
In this approach, fuzzy-logic-based modeling is adopted to automatically learn the resource usage behaviors of database VMs from observed query workload characteristics and VM resource consumption. This modeling method does not require any a priori knowledge of the system’s internal structure, and it can efficiently describe complex and nonlinear system behaviors. Specifically, a database VM’s resource model is constructed online and updated dynamically to learn the relationship between a query workload’s changing characteristics and the VM’s needs for multiple types of resources, particularly CPU cycles and I/O bandwidth. This model is then applied, also online, to predict the database VM’s multi-type resource needs for its current workload and to allocate resources efficiently to the VM while meeting the QoS target for the queries.

This resource management system is implemented for Xen-based VM environments and evaluated through a series of experiments based on typical database benchmarks (TPC-H [3], RUBiS [4]). The results demonstrate that the system can efficiently allocate resources for a database VM that is serving CPU and I/O intensive queries while still delivering the same level of performance as when all the resources are dedicated to the VM. The results also show that the system can adapt to dynamic transitions of the database VM’s resource usage caused by changing workload intensity and composition, achieving both resource efficiency and query QoS in a timely manner.
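
The online learn-then-predict loop outlined above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation evaluated in this paper: a zero-order Takagi-Sugeno-style fuzzy model with Gaussian memberships and a simple error-driven update is swapped in because this section does not specify the model structure. The class `FuzzyResourceModel`, the function `allocate`, the single workload feature `x`, the rule centers, the headroom factor, and the capacity limits are all hypothetical names and parameters.

```python
import math


class FuzzyResourceModel:
    """Zero-order Takagi-Sugeno-style fuzzy model (illustrative assumption).

    Maps one workload feature (e.g., query arrival rate) to a predicted
    (cpu, io) resource need. All names and units here are hypothetical.
    """

    def __init__(self, centers, width, learning_rate=0.5):
        self.centers = list(centers)        # rule centers over the workload feature
        self.width = float(width)           # shared Gaussian membership width
        self.learning_rate = learning_rate  # step size for the online update
        # One [cpu, io] consequent per rule, learned online from observations.
        self.consequents = [[0.0, 0.0] for _ in self.centers]

    def _weights(self, x):
        """Return normalized Gaussian firing strengths of all rules for input x."""
        raw = [math.exp(-((x - c) ** 2) / (2.0 * self.width ** 2))
               for c in self.centers]
        total = sum(raw)
        if total == 0.0:
            # Input far from every rule: fall back to the nearest rule only.
            nearest = min(range(len(self.centers)),
                          key=lambda i: abs(x - self.centers[i]))
            return [1.0 if i == nearest else 0.0 for i in range(len(raw))]
        return [r / total for r in raw]

    def predict(self, x):
        """Predict (cpu, io) need as the firing-strength-weighted average."""
        w = self._weights(x)
        cpu = sum(wi * q[0] for wi, q in zip(w, self.consequents))
        io = sum(wi * q[1] for wi, q in zip(w, self.consequents))
        return cpu, io

    def update(self, x, observed):
        """Move each rule's consequent toward the observed (cpu, io) usage."""
        w = self._weights(x)
        predicted = self.predict(x)
        for wi, q in zip(w, self.consequents):
            for k in range(2):
                q[k] += self.learning_rate * wi * (observed[k] - predicted[k])


def allocate(model, x, headroom=0.1, capacity=(100.0, 100.0)):
    """Turn a prediction into an allocation with headroom, capped at capacity."""
    cpu, io = model.predict(x)
    return (min(cpu * (1.0 + headroom), capacity[0]),
            min(io * (1.0 + headroom), capacity[1]))
```

In each control interval, a controller of this kind would call `update` with the workload feature and the resource usage observed for the VM, then call `allocate` with the current workload feature to obtain the CPU and I/O allocations for the next interval. The headroom factor is one simple, assumed way to trade resource efficiency against the risk of violating the QoS target.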
In summary, this paper makes the following contributions: 1) It proposes a novel autonomic resource management system for database VMs, which can efficiently allocate different types of resources according to the query workload demand and can adapt in a timely manner to changes in their resource usage behaviors; 2) It develops an implementation for typical Xen-based VM systems, which can manage and optimize the use of both CPU cycles and I/O bandwidth for database VMs serving resource-intensive workloads; 3) The overall approach proposed in this paper is also generally