Cluster Comput
DOI 10.1007/s10586-014-0384-x

A comparative study on resource allocation and energy efficient job scheduling strategies in large-scale parallel computing systems

Aftab Ahmed Chandio · Kashif Bilal · Nikos Tziritas · Zhibin Yu · Qingshan Jiang · Samee U. Khan · Cheng-Zhong Xu

Received: 11 November 2013 / Revised: 17 March 2014 / Accepted: 18 May 2014
© Springer Science+Business Media New York 2014

Abstract In large-scale parallel computing environments, resource allocation and energy-efficient techniques are required to deliver quality of service (QoS) and to reduce the operational cost of the system, because the cost of energy consumption is a dominant part of the owner’s and user’s budget. However, when energy efficiency is considered, resource allocation becomes more difficult, and QoS requirements (i.e., queue time and response time) may be violated. This paper therefore presents a comparative study of job scheduling in large-scale parallel systems that aims to (a) minimize the queue time, response time, and energy consumption and (b) maximize the overall system utilization. We compare thirteen job scheduling policies to analyze their behavior. The set of policies includes (a) priority-based, (b) first fit, (c) backfilling, and (d) window-based policies. All of the policies are extensively simulated and compared. For the simulation, a real data center workload comprising 22385 jobs is used. Based on their performance results, we incorporate energy efficiency into three policies: (1) the best result producer, (2) the average result producer, and (3) the worst result producer. We analyze the (a) queue time, (b) response time, (c) slowdown ratio, and (d) energy consumption to evaluate the policies. Moreover, we present a comprehensive workload characterization for optimizing the system’s performance and for scheduler design. Major workload characteristics, including (a) narrow, (b) wide, (c) short, and (d) long jobs, are characterized for a detailed analysis of the schedulers’ performance. This study highlights the strengths and weaknesses of various job scheduling policies and helps in choosing an appropriate job scheduling policy for a given scenario.

Keywords Parallel computing systems · Job scheduling · Workload characterization · Data center · Energy efficiency

A. A. Chandio · N. Tziritas · Z. Yu · Q. Jiang · S. U. Khan · C.-Z. Xu
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, People’s Republic of China
e-mail: aftabac@siat.ac.cn

A. A. Chandio
Graduate University of Chinese Academy of Sciences, Beijing, People’s Republic of China

A. A. Chandio
Institute of Mathematics and Computer Science, University of Sindh, Jamshoro, Pakistan

N. Tziritas
e-mail: nikolaos@siat.ac.cn

Z. Yu
e-mail: zb.yu@siat.ac.cn

Q. Jiang
e-mail: qs.jiang@siat.ac.cn

C.-Z. Xu
e-mail: cz.xu@siat.ac.cn

K. Bilal · S. U. Khan (✉)
Department of Electrical and Computer Engineering, North Dakota State University, Fargo, ND, USA
e-mail: samee.khan@ndsu.edu

K. Bilal
e-mail: kashif.bilal@ndsu.edu

C.-Z. Xu
Department of Electrical and Computer Engineering, Wayne State University, Detroit, MI, USA