Fine-Grained Resource Provisioning and Task Scheduling for Heterogeneous Applications in Distributed Green Clouds

Haitao Yuan, Member, IEEE, MengChu Zhou, Fellow, IEEE, Qing Liu, and Abdullah Abusorrah, Senior Member, IEEE

Abstract—In recent years, an increasing number of enterprises have adopted cloud computing to manage their important business applications in distributed green cloud (DGC) systems for low response time and high cost-effectiveness. Task scheduling and resource allocation in DGCs have gained growing attention in both academia and industry because DGCs are costly to manage due to their high energy consumption. Many factors in DGCs, e.g., power grid prices and the amount of available green energy, exhibit strong spatial variations. The dramatic increase in arriving tasks makes it a big challenge to minimize the energy cost of a DGC provider in a market where all the above factors vary spatially. This work adopts a G/G/1 queuing system to analyze the performance of servers in DGCs. Based on it, a single-objective constrained optimization problem is formulated and solved by a proposed simulated-annealing-based bees algorithm (SBA). SBA minimizes the energy cost of a DGC provider by optimally allocating tasks of heterogeneous applications among multiple DGCs, and by specifying the running speed of each server and the number of powered-on servers in each GC, while strictly meeting the response time limits of all applications' tasks. Experimental results based on realistic data prove that SBA achieves lower energy cost than several benchmark scheduling methods do.

Index Terms—Bees algorithm, data centers, distributed green cloud (DGC), energy optimization, intelligent optimization, machine learning, simulated annealing, task scheduling.

I. Introduction

Providing cloud computing applications has attracted a great deal of attention in both academia and industry [1].
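The abstract describes combining bees-algorithm search with simulated-annealing acceptance to solve a constrained energy-cost minimization. The sketch below is a minimal, generic illustration of that hybrid idea only; the function name, parameters, and the toy sphere cost are assumptions for illustration and do not reproduce the paper's actual DGC model or constraints.

```python
import math
import random

def sba_minimize(cost, dim, bounds, n_bees=20, n_elite=5,
                 iters=200, t0=1.0, alpha=0.95, seed=42):
    """Hypothetical sketch: scout bees sample the space, elite sites are
    refined by local search, and worse neighbors may be accepted with a
    simulated-annealing probability that decays as the temperature cools."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Scout phase: random initial candidate sites.
    sites = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_bees)]
    best = min(sites, key=cost)
    temp = t0
    for _ in range(iters):
        sites.sort(key=cost)
        new_sites = []
        for site in sites[:n_elite]:
            # Neighborhood search around each elite site.
            neigh = [x + rng.gauss(0, 0.1 * (hi - lo)) for x in site]
            neigh = [min(max(x, lo), hi) for x in neigh]
            delta = cost(neigh) - cost(site)
            # SA acceptance: take improvements; sometimes accept worse moves.
            if delta < 0 or rng.random() < math.exp(-delta / temp):
                new_sites.append(neigh)
            else:
                new_sites.append(site)
        # Remaining bees keep scouting randomly (global exploration).
        new_sites += [[rng.uniform(lo, hi) for _ in range(dim)]
                      for _ in range(n_bees - n_elite)]
        sites = new_sites
        cand = min(sites, key=cost)
        if cost(cand) < cost(best):
            best = cand
        temp *= alpha  # cooling schedule
    return best, cost(best)

# Toy usage: a sphere function stands in for the provider's energy cost.
best, val = sba_minimize(lambda x: sum(v * v for v in x), dim=3, bounds=(-5, 5))
```

In the paper's actual problem, each candidate would instead encode task allocations, server speeds, and powered-on server counts, with response-time limits enforced as constraints rather than simple box bounds.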
Cloud computing has greatly changed the way information technology infrastructure is provided to satisfy various business needs [2]. It allows enterprises to dynamically scale resources up or down according to their actual needs by enabling on-demand infrastructure provisioning [3]. It also realizes significant improvement in mission or business proficiencies without enlarging resource needs. In addition, by supporting a pay-as-you-go service model, it removes initial capital, maintenance, and software licensing costs. The trend towards it provides a new paradigm of storage and computing, and has led to the proliferation of data centers [4]. Many famous companies, e.g., Microsoft, Google, Amazon, and Apple, have selected this model to provide services to users more efficiently and quickly [5].

One major concern about cloud computing is its enormous energy consumption. Within two or three years, about 95% of urban data centers are expected to experience total or partial outages that incur an annual cost of roughly 2 million US$ per infrastructure [6]. About 28% of these outages would be caused by exceeding the maximum grid capacity. Besides the economic concern, the carbon footprint and the heat produced by their cooling systems are increasing significantly, and are expected to exceed the airline industry's emissions by 2020. According to [7], data centers consumed about 2.2% of total U.S. electricity consumption, and emitted more than 43 million tons of CO2 annually. It is predicted that they will consume 140 billion kilowatt-hours annually by 2020, and the cost of producing all the electricity they require exceeds $7 billion a year. Each large-scale green cloud (GC) usually needs as much energy as 25 000 households on average. With the continual growth in their energy consumption, energy optimization has become a major concern in their server provisioning and cooling systems.
Resource over-provisioning is a major cause of power inefficiency in data centers because resources allocated for the peak need are under-utilized most of the time. For instance, it is reported that the average server utilization is only between 10% and 30%, which means that considerable capacity is wasted. The main contributor to the energy consumed by data centers is infrastructure, including servers and other equipment, and this power is dominated by that consumed by enterprise servers, which accounts for up to 60% of their total energy consumption [8]. Therefore, many have proposed

Manuscript received January 28, 2020; revised February 22, 2020; accepted March 17, 2020. This work was supported in part by the National Natural Science Foundation of China (61802015, 61703011), the Major Science and Technology Program for Water Pollution Control and Treatment of China (2018ZX07111005), the National Defense Pre-Research Foundation of China (41401020401, 41401050102), and the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah (D-422-135-1441). Recommended by Associate Editor Peiyun Zhang. (Corresponding authors: Haitao Yuan and MengChu Zhou.)

Citation: H. T. Yuan, M. C. Zhou, Q. Liu, and A. Abusorrah, "Fine-grained resource provisioning and task scheduling for heterogeneous applications in distributed green clouds," IEEE/CAA J. Autom. Sinica, vol. 7, no. 5, pp. 1380–1393, Sept. 2020.

H. T. Yuan, M. C. Zhou, and Q. Liu are with the Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ 07102 USA (e-mail: haitao.yuan@njit.edu; zhou@njit.edu; qliu@njit.edu).

A. Abusorrah is with the Department of Electrical and Computer Engineering, Faculty of Engineering, and the Center of Research Excellence in Renewable Energy and Power Systems, King Abdulaziz University, Jeddah 21589, Saudi Arabia (e-mail: aabusorrah@kau.edu.sa).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JAS.2020.1003177