Networking-Computing Resource Allocation for Hard Real-Time Green Cloud Applications

N. Cordeschi, Sapienza University of Rome, nicola.cordeschi@uniroma1.it
D. Amendola, Sapienza University of Rome, danilo.amendola@uniroma1.it
F. De Rango, University of Calabria, derango@dimes.unical.it
E. Baccarelli, Sapienza University of Rome, enzo.baccarelli@uniroma1.it

Abstract—Performing real-time applications on top of virtualized cloud systems requires that the overall per-job delay due to in-cloud processing is upper bounded in a hard way. This raises the question of the optimal dynamic joint allocation of both the computing and networking resources hosted in the Cloud. This is the focus of this contribution, where we develop in closed form the optimal fully scalable energy-saving scheduler for the joint allocation of the task sizes, communication rates and processing rates in delay-constrained Clouds composed of multiple frequency-scalable parallel Virtual Machines (VMs).

Keywords—Green Cloud, Computing-communication resources, Hard real-time applications, Dynamic Voltage and Frequency Scaling (DVFS).

I. INTRODUCTION AND CLOUD ARCHITECTURE

The goal of Green Cloud Computing is to develop models and techniques for the integrated management of computing-communication virtualized platforms, so as to provide QoS, robustness and energy efficiency. The resulting challenge is to minimize the energy usage while still meeting the QoS requirements of the supported applications. Regarding QoS support, an energy-saving joint allocation of both the networking and computing resources hosted in the Cloud is needed. This is the focus of this paper, where the contrasting objectives of minimizing both the networking and computing energies of real-time applications running on top of virtualized Clouds are cast in the form of a suitable constrained optimization problem. DVFS-enabled dynamic power management schemes for cluster-based embedded computing platforms are the focus of [12], [13], [15].
Although hard deadline constraints are explicitly considered in these works, job decomposition is not, and the on-line implementation complexity of the schedulers proposed therein grows at least as O(M log(M)), where M is the minimum between the number of tasks to be executed and the number of available parallel processors. Energy-saving management of computing and communication resources is the specific topic of [?], [16]–[21]. Overall, all these contributions on Cloud Computing do not account for the energy-consumption profiles of the underlying networking infrastructures.

Regarding the architecture, a cluster platform for parallel computing is composed of multiple processing units and a central resource controller [1]. Each processing unit executes the currently assigned task as an independent processor, self-managing its own local storage/computing resources. A new job is initiated by an event, namely the arrival of a file of size L_t (bit) to be processed by the Cloud. Due to the real-time nature of the considered application scenario, full processing of the input file must be carried out within a given (e.g., a priori assigned) processing time T_t (sec). Hence, in our framework, a real-time job may be suitably characterized by [7]: i) the size L_t (bit) of the file to be processed; ii) the maximum tolerated processing delay T_t (sec); and, iii) its granularity, that is, the (integer-valued) maximum number M_T ≥ 1 of independent parallel tasks into which we may decompose the submitted job [7, Sect. 2.4]. Let M_V ≥ 1 be the (integer-valued) maximum number of VMs that may be instantiated onto the Cloud. In principle, each VM may be modeled as a (virtual) server capable of processing f_c bits per second [11]. The rate f_c may be adaptively adjusted and may assume values over the interval [0, f_c^max], where f_c^max (bit/sec) is the maximum allowed processing rate.
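The job triple (L_t, T_t, M_T) and the per-VM rate range [0, f_c^max] described above can be captured by a minimal sketch. The class and field names below are our own illustrative choices (the paper fixes only the symbols), and the numeric values in the usage lines are arbitrary:

```python
from dataclasses import dataclass

@dataclass
class Job:
    L_t: float  # overall file size to be processed (bit)
    T_t: float  # maximum tolerated processing delay (sec)
    M_T: int    # granularity: max number of independent parallel tasks

@dataclass
class VM:
    f_max: float      # maximum processing rate f_c^max (bit/sec)
    f_c: float = 0.0  # current (adaptively adjustable) rate, 0 <= f_c <= f_max

    def utilization(self) -> float:
        """Utilization factor eta = f_c / f_c^max, in [0, 1]."""
        assert 0.0 <= self.f_c <= self.f_max
        return self.f_c / self.f_max

# Illustrative instance: an 8 Mbit job, 0.5 s deadline, up to 4 parallel tasks,
# processed by a VM running at a quarter of its 1 Gbit/s maximum rate.
job = Job(L_t=8e6, T_t=0.5, M_T=4)
vm = VM(f_max=1e9, f_c=2.5e8)
print(vm.utilization())  # 0.25
```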
Furthermore, due to the real-time nature of the considered application scenario, the time allowed to the VM to fully process each submitted task is fixed in advance at Δ (sec), regardless of the actual size L of the task currently submitted to the VM. Hence, by definition, the utilization factor η of the VM equates [9]: η ≜ f_c/f_c^max ∈ [0, 1]. In order to characterize the energy consumption of a VM, let E_c ≡ E_c(f_c) (Joule) be the overall energy consumed by the VM to process a single task of duration Δ (sec) when the VM works at the processing rate f_c, and let E_c^max ≜ E_c(f_c^max) (Joule) be the maximum energy required by the VM to perform a single task of duration Δ (sec) when it operates at the maximum processing rate f_c^max. Hence, by definition, the ratio Φ(η) ≜ E_c(f_c)/E_c^max ≡ Φ(f_c/f_c^max) is the so-called Normalized Energy Consumption of the considered VM [9]. At least for CMOS-based physical CPUs, Φ(η) is strictly increasing and strictly convex in η. In practice, the form assumed by Φ(η) for DVFS-enabled CMOS CPUs is recognized to be well approximated by the following quadratic one [10], [12]: Φ(η) = η², η ∈ [0, 1]. Let M ≜ min{M_V, M_T} be the degree of concurrency of the submitted job (that is, the number of non-overlapping tasks that can be executed in parallel for carrying out the job [7, Sect. 2.4]), and let L_i (bit) be the size of the task currently submitted to VM(i). Let L_t (bit) be the overall size of the job currently submitted to the Cloud, and let L_i ≥ 0, i = 1,...,M, be the size of the task that the Scheduler assigns to VM(i). Hence, the following constraint: Σ_{i=1}^{M} L_i = L_t guarantees that the overall job L_t is partitioned into (at most) M parallel tasks. After the scheduling phase ends, VM(i) is forced to process the assigned task of size L_i within Δ_i secs.
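The quadratic energy model Φ(η) = η² and the partition constraint Σ_{i=1}^{M} L_i = L_t above can be sketched numerically. The even split used below is only one feasible partition chosen for illustration (the paper's Scheduler optimizes the L_i), and all numeric values are arbitrary:

```python
def vm_energy(f_c: float, f_max: float, E_max: float) -> float:
    """Per-task energy E_c(f_c) = E_max * Phi(eta), with Phi(eta) = eta**2
    and utilization eta = f_c / f_max (quadratic DVFS model)."""
    eta = f_c / f_max
    assert 0.0 <= eta <= 1.0
    return E_max * eta ** 2

def split_job(L_t: float, M_V: int, M_T: int) -> list:
    """One feasible (even) partition of a job of size L_t over the
    M = min(M_V, M_T) parallel VMs (the degree of concurrency)."""
    M = min(M_V, M_T)
    sizes = [L_t / M] * M
    assert abs(sum(sizes) - L_t) < 1e-6  # constraint: sum_i L_i = L_t
    return sizes

# Running at half the maximum rate costs a quarter of the maximum energy:
print(vm_energy(f_c=5e8, f_max=1e9, E_max=4.0))  # 1.0
# An 8 Mbit job with M_V = 3 VMs and granularity M_T = 4 splits into M = 3 tasks:
print(split_job(L_t=8e6, M_V=3, M_T=4))
```

The quadratic form makes the convexity claim concrete: halving f_c quarters the per-task energy, which is what rewards spreading a job across many slow VMs rather than one fast one.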
In order to keep the transmission delays from (to) the Scheduler to (from) the connected VMs at a minimum, as suggested in [2], [8], we assume that each VM communicates with the Scheduler via a dedicated (i.e., contention-free) reliable link that works at the transmission rate C_i (bit/sec), i = 1,...,M. Specifically, we assume that the i-th link is bidirectional, symmetric and operates in a half-duplex way [8]. Furthermore, we also assume that the one-way transmission-plus-switching operation over the i-th link drains a (fixed)