CPU Utilization while Scaling Resources in the Cloud

Marjan Gusev, Sasko Ristov, Monika Simjanoska, and Goran Velkoski
Faculty of Information Sciences and Computer Engineering
Ss. Cyril and Methodius University
Skopje, Macedonia
Email: marjan.gushev@finki.ukim.mk, sashko.ristov@finki.ukim.mk, m.simjanoska@gmail.com, velkoski.goran@gmail.com

Abstract—CPU utilization in a virtual machine instance directly impacts the overall cost for the cloud service provider, since it generates costs for power consumption and cooling. We aim to determine how the total CPU utilization behaves when the number of CPU cores is scaled under the same server load. The experiments use two simple web services to load the virtual machine instance, varying the number of concurrent messages and their size. The goal is to check whether the total CPU utilization while scaling is sublinear (smaller than the number of cores), and whether it is greater than the CPU utilization when executed without scaling (using only one CPU core) due to task scheduling, coherence, etc. The experiments confirm only the first: the total CPU utilization is sublinear. We observe a region (workloads with a smaller number of concurrent messages) where the total CPU utilization decreases while scaling, compared to the case without scaling. We also determine the correlation of the CPU utilization with the message size and the number of concurrent messages.

Keywords-Cloud Computing; Performance; Web Services; Web Server.

I. INTRODUCTION

Cloud computing is a recent technological trend in which resources, such as CPU and storage, are provided as general utilities that users can lease and release on demand according to their requirements [1]. The cloud is a promising approach for delivering ICT services by improving the utilization of data centre resources [2].
Scalability and elasticity are quality features of the cloud, since the cloud adjusts itself to achieve better performance whenever it detects a change in the environment [3]. Scaling the performance for a growing problem size is imperative [4], [5]. However, the resulting performance is not always acceptable for all applications hosted in the cloud [6].

While the cloud customer's cost depends on how long the resources are leased, the cloud service provider's cost depends mostly on the CPU utilization of the active (leased) resources. That is, greater CPU utilization increases the cost not only of electricity but also of cooling. Activating and utilizing more computing resources increases the monthly costs of a cloud data center (approximately 40% of the costs are generated by electricity and cooling). Reallocating virtual machines and switching off idle servers saves substantial energy [7]. Optimal resource allocation can improve the performance using the same resources in the cloud [8]. Saleh et al. [9] have demonstrated that a CPU utilization threshold is not an accurate measure for autoscaling the resources, since it can lead to high cost and poor resource utilization.

Scaling the resources reduces the CPU utilization per core, but we are interested in whether the total CPU utilization is reduced or increased as well. We set two hypotheses to check:

H1: the total CPU utilization while scaling is sublinear (smaller than the number of cores); and
H2: the total CPU utilization while scaling is greater than the CPU utilization when executed without scaling, due to task scheduling, coherence, etc.

That is, we expect the total CPU utilization to be in the range $(U_1, U_1 \cdot n)$, where $U_1$ denotes the CPU utilization of a virtual machine instance with one CPU core allocated and $n$ is the number of CPU cores after scaling.
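The two hypotheses can be checked mechanically once the utilizations are measured. The following is a minimal sketch, not the authors' tooling: the function names are hypothetical, the total utilization is assumed to be the sum of the per-core utilization fractions, and the numbers in the usage note are illustrative rather than measured results.

```python
from typing import Sequence, Tuple


def total_utilization(per_core: Sequence[float]) -> float:
    """Total CPU utilization, ASSUMED here to be the sum of the
    per-core utilization fractions (each between 0.0 and 1.0)."""
    return sum(per_core)


def check_hypotheses(u1: float, u_total: float, n: int) -> Tuple[bool, bool]:
    """Return (H1, H2) for a scaled instance with n cores.

    u1      -- utilization of the instance with one core (U_1)
    u_total -- total utilization of the instance with n cores
    H1 holds if u_total < u1 * n (sublinear growth).
    H2 holds if u_total > u1 (scaling adds overhead).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    h1 = u_total < u1 * n
    h2 = u_total > u1
    return h1, h2


if __name__ == "__main__":
    # Illustrative values only, not results from the experiments.
    two_cores = total_utilization([0.30, 0.25])
    print(check_hypotheses(u1=0.40, u_total=two_cores, n=2))
```

With these illustrative inputs both hypotheses hold, because the total lies strictly inside the expected range. A total below the single-core utilization would satisfy H1 but violate H2, which corresponds to the region reported in the abstract where the total CPU utilization decreases while scaling.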
We conduct several experiments to determine the behavior of the CPU utilization when scaling is applied, i.e., when more powerful virtual machine instances (using more processor cores) are activated. The experiments measure the CPU utilization while scaling from 1 to 2 and 4 CPU cores in a virtual machine instance. We use two simple web services, Concat and Sort, to load the web server in the virtual machine instances. The former concatenates two strings, and the latter sorts the concatenation of two input strings. Both are memory demanding, and the second is also computationally intensive. We analyze the CPU utilization by varying the server load with different numbers of concurrent messages and different input string sizes.

The rest of the paper is organized as follows. Related work is presented in Section II. In Section III, we describe the methodology used for testing. The experiments and the results are discussed in Sections IV and V. In Section VI, we draw conclusions and present future work.

II. RELATED WORK

Several papers have analyzed CPU utilization on-premise and in the cloud while loading the same web services with various numbers of concurrent messages and message sizes. Gusev et al. [10] determined that the number of concurrent messages directly impacts the CPU utilization for memory

131 Copyright (c) IARIA, 2013. ISBN: 978-1-61208-271-4 CLOUD COMPUTING 2013 : The Fourth International Conference on Cloud Computing, GRIDs, and Virtualization