Moldable Parallel Job Scheduling Using Job Efficiency: An Iterative Approach

Gerald Sabin, Matthew Lang, and P. Sadayappan

Dept. of Computer Science and Engineering, The Ohio State University, Columbus OH 43201, USA, {sabin, langma, saday}@cse.ohio-state.edu

Abstract. Currently, job schedulers require “rigid” job submissions from users, who must specify a particular number of processors for each parallel job. Most parallel jobs can be run on different processor partition sizes, but there is often a trade-off between wait-time and run-time: asking for many processors reduces run-time but may require a protracted wait. With moldable scheduling, the choice of job partition size is determined by the scheduler, using information about job scalability characteristics. We explore the role of job efficiency in moldable scheduling, through the development of a scheduling scheme that utilizes job efficiency information. The algorithm is able to improve the average turnaround time, but requires tuning of parameters. Using this exploration as motivation, we then develop an iterative scheme that avoids the need for any parameter tuning. The iterative scheme performs an intelligent, heuristic-based search for a schedule that minimizes average turnaround time. It is shown to perform better than other recently proposed moldable job scheduling schemes, with good response times for both the small and large jobs, when evaluated with different workloads.

1 Introduction

Parallel job scheduling in a space-shared environment [1–5] is a research topic that has received a large amount of attention. Traditional approaches to job scheduling operate under the principle that jobs are rigid: they are submitted to run on a certain number of processors, and that number is inflexible.
Previously considered rigid scheduling schemes range from an early and simple first-come-first-serve (FCFS) strategy, which suffers from severe fragmentation and leads to poor utilization, to current backfilling policies, which attempt to reduce the number of wasted cycles. Backfilling creates reservations for N jobs from a sorted queue (often based on arrival time, job size, or current wait time), and then allows jobs to start “out of order” provided that no reservations are violated. Variations of N, such as N = 1 (aggressive or EASY backfilling) or N = ∞ (conservative backfilling), exhibit different behaviors and have been studied in detail. The vast majority of this work assumes that the user provides the number of nodes the job must run on as well as the job’s estimated runtime. However, many jobs do not actually require a specific number of processors; they can run on a range of processors. This range may be limited by constraints