Greedy Scheduling with Complex Objectives

Carsten Franke, Joachim Lepping, and Uwe Schwiegelshohn

Abstract—We present a methodology for automatically generating an online scheduling process for an arbitrary objective with the help of Evolution Strategies. The scheduling problem comprises independent parallel jobs and multiple identical machines and occurs in many real Massively Parallel Processing systems. The system owner defines the objective, which may consider job waiting times and priorities of user groups. Our scheduling process is a variant of the simple and commonly used Greedy scheduling algorithm in combination with a repeated sorting of the waiting queue. This sorting uses a criterion whose parameters are evolutionarily optimized. We evaluate our new scheduling process with real workload data and compare it to the best offline solutions and to the online results of the standard EASY backfill algorithm. To this end, we partition the users of the workloads into groups and select an exemplary objective that prioritizes some of those groups over others.

I. INTRODUCTION

In this paper, we use Evolution Strategies to generate a method that specifically considers the preferences of the system owner when automatically generating an online scheduling process for Massively Parallel Processing (MPP) systems. Our scheduling problem is taken from real MPP installations: Different users submit independent, non-clairvoyant parallel jobs to the MPP system over time. The scheduling process is responsible for assigning those jobs to the available identical machines of the MPP system. Because of limited available processing time, existing online scheduling processes at real installations mainly use Greedy scheduling [1] in combination with First-Come-First-Serve and Backfilling [2] methods. Although those algorithms produce a very low utilization in the worst case [3], they work well and fast in practice [4].
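The combination of Greedy scheduling with a repeatedly sorted waiting queue can be illustrated by a minimal Python sketch. It is not the implementation used in this paper: the names `Job`, `greedy_schedule`, and `sort_key` are hypothetical, the default key simply orders jobs by release date (First-Come-First-Serve order), and the true processing time is used only to simulate job completion, since a real online scheduler does not know it in advance.

```python
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class Job:
    name: str
    release: float   # release date r_j
    runtime: float   # processing time p_j (used only to simulate completion)
    width: int       # number of machines m_j required


def greedy_schedule(jobs, m, sort_key=lambda job: job.release):
    """Return {job name: start time} for a greedy schedule on m machines.

    At every event (job release or job completion) the waiting queue is
    re-sorted with sort_key and each job that fits is started at once.
    """
    pending = sorted(jobs, key=lambda job: job.release)
    queue = []      # released jobs that are still waiting
    running = []    # min-heap of (finish time, width)
    free = m
    now = 0.0
    starts = {}
    i = 0
    while i < len(pending) or queue:
        # Admit all jobs released up to the current time.
        while i < len(pending) and pending[i].release <= now:
            queue.append(pending[i])
            i += 1
        # Reclaim the machines of jobs that have finished.
        while running and running[0][0] <= now:
            free += heapq.heappop(running)[1]
        # Re-sort the queue, then greedily start every job that fits.
        queue.sort(key=sort_key)
        for job in list(queue):
            if job.width <= free:
                free -= job.width
                starts[job.name] = now
                heapq.heappush(running, (now + job.runtime, job.width))
                queue.remove(job)
        # Advance to the next release or completion event.
        events = []
        if i < len(pending):
            events.append(pending[i].release)
        if running:
            events.append(running[0][0])
        if not events:
            break   # remaining jobs need more than m machines
        now = min(events)
    return starts


jobs = [Job("A", 0, 10, 2), Job("B", 0, 5, 2), Job("C", 1, 2, 1)]
fcfs_order = greedy_schedule(jobs, m=3)
```

Passing a different `sort_key` changes only the order in which waiting jobs are considered. The scheduling process described in this paper would replace the key by a criterion with tuned parameters.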
Other scheduling algorithms have been the subject of theoretical and simulation studies but are rarely found at real installations, as they often require more execution time; see Feitelson et al. [5] and the references therein. Therefore, our scheduling process is also based on a Greedy approach. However, the parameters are carefully chosen to consider the scheduling criterion and the workload. To this end, we use Evolution Strategies, which are introduced in Section II-B.

Manuscript received on October 31, 2006. This work was financially supported by the Deutsche Forschungsgemeinschaft (DFG) as a subproject of the Collaborative Research Center 531, "Computational Intelligence", at the Dortmund University. Carsten Franke is a former member of the Robotics Research Institute at Dortmund University and is now with SAP Research CEC Belfast, TEIC Building, University of Ulster, Newtownabbey BT37 0QB, UK (email: carsten.franke@sap.com). Joachim Lepping and Uwe Schwiegelshohn are with the Robotics Research Institute, Section Information Technology (IRF-IT), 44221 Dortmund, Germany (email: {joachim.lepping, uwe.schwiegelshohn}@udo.edu).

In real life, system owners have different relationships to the various users or user groups of their systems. Those relationships lead to different priorities of the users and their jobs. This is particularly true as MPP systems increasingly become part of Computational Grids [6], that is, low-priority users from other sites request system resources as well. Because user priorities differ and may change over time, there is a growing need for scheduling systems that can flexibly consider those priorities without reducing overall system utilization. As existing scheduling strategies are not suited to handle those priorities satisfactorily, system owners often set partitions or quotas to prevent lower-priority groups from occupying too many resources.
However, those restrictions tend to reduce machine utilization significantly [7].

We model an MPP system as $m$ identical parallel machines. This model closely matches reality as differences between the nodes of an MPP system usually are not significant. Job scheduling on MPP systems is an online problem as jobs are submitted over time and the processing time $p_j$ of job $j$ is not available at the release date $r_j$. However, system administrators often require users to provide estimates $\bar{p}_j$ of the processing time $p_j$ in order to detect faulty jobs, namely those whose processing times exceed a rather high estimate. Therefore, the estimates are not really reliable [8]. Nevertheless, they are used for scheduling with backfilling as no other data are available [6]. Further, many parallel jobs on MPP systems are not moldable or malleable [5], that is, they need concurrent and exclusive access to $m_j \le m$ machines during their whole execution. The user provides the value $m_j$ at the release date $r_j$ of the job. The completion time of job $j$ in schedule $S$ is denoted by $C_j(S)$. As preemption [5] is not allowed in many MPP systems, each job starts its execution at time $C_j(S) - p_j$.

Our work is based on workload traces of real installations. Such workload data include all hidden job dependencies, patterns, and feedback mechanisms. Several workloads of MPP systems are publicly available, see the standard workload archive maintained by Feitelson [9]. Unfortunately, they only partly provide user group information, while owner priorities are missing completely. Therefore, this information is added to the workloads as explained in Section IV-A. As already mentioned, common scheduling algorithms only support standard scheduling criteria like utilization or average waiting time [5]. To demonstrate the ability of our method to support unconventional criteria, we define one in Section IV-B.
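To make the notation concrete, the following Python sketch derives the start time and waiting time of a job from $r_j$, $p_j$, and $C_j(S)$, and evaluates the two standard criteria named above. The class and function names are hypothetical. The utilization formula (work performed divided by the machine capacity offered between the first release and the last completion) is one common definition and is an assumption here, not a definition taken from this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduledJob:
    release: float      # r_j
    runtime: float      # p_j
    width: int          # m_j
    completion: float   # C_j(S)

    @property
    def start(self) -> float:
        # Without preemption a job runs uninterrupted until C_j(S).
        return self.completion - self.runtime

    @property
    def wait(self) -> float:
        # Time between release and start of execution.
        return self.start - self.release


def average_waiting_time(jobs):
    """Mean of (C_j(S) - p_j - r_j) over all jobs."""
    return sum(job.wait for job in jobs) / len(jobs)


def utilization(jobs, m):
    """Assumed definition: used machine time over offered machine time."""
    span = max(job.completion for job in jobs) - min(job.release for job in jobs)
    work = sum(job.runtime * job.width for job in jobs)
    return work / (m * span)


schedule = [
    ScheduledJob(release=0, runtime=10, width=2, completion=10),
    ScheduledJob(release=0, runtime=5, width=2, completion=15),
    ScheduledJob(release=1, runtime=2, width=1, completion=3),
]
awt = average_waiting_time(schedule)
util = utilization(schedule, m=3)
```

In this toy schedule only the second job waits (10 time units), so the average waiting time is 10/3, and 32 of the 45 offered machine-time units are used.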
Proceedings of the 2007 IEEE Symposium on Computational Intelligence in Scheduling (CI-Sched 2007) 1-4244-0704-4/07/$20.00 ©2007 IEEE

Finally, we use two approaches to evaluate the results of our scheduling process:
1) Comparison with an approximation of the optimal offline result
2) Comparison with the result of a standard online schedul-