(c) 2003 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by a contractor or affiliate of the [U.S.] Government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only. SC'03, November 15-21, 2003, Phoenix, Arizona, USA Copyright 2003 ACM 1-58113-695-1/03/0011...$5.00

Improving the Scalability of Parallel Jobs by adding Parallel Awareness to the Operating System

June 30, 2003

Terry Jones, Shawn Dawson, Rob Neely
William Tuel, Larry Brenner, Jeffrey Fier, Robert Blackmore, Patrick Caffrey
Brian Maskell, Paul Tomlinson, Mark Roberts

Lawrence Livermore National Laboratory, Livermore, CA, USA 94550
International Business Machines Corporation, Armonk, NY, USA 10504
Atomic Weapons Establishment, Aldermaston, Reading, UK RG7 4PR

Abstract

A parallel application benefits from scheduling policies that include a global perspective of the application’s process working set. As the interactions among cooperating processes increase, mechanisms to ameliorate waiting within one or more of the processes become more important. In particular, collective operations such as barriers and reductions are extremely sensitive to even usually harmless events such as context switches among members of the process working set. For the last 18 months, we have been researching the impact of random short-lived interruptions such as timer-decrement processing and periodic daemon activity, and developing strategies to minimize their impact on large processor-count SPMD bulk-synchronous programming styles.
We present a novel co-scheduling scheme for improving the performance of fine-grain collective activities such as barriers and reductions, describe an implementation consisting of operating system kernel modifications and a run-time system, and present a set of empirical results comparing the technique with traditional operating system scheduling. Our results indicate a speedup of over 300% on synchronizing collectives.

1. Introduction

Traditional operating systems based on UNIX® and its variants (including AIX® and Linux) have serious deficiencies for large-scale parallel environments such as those of interest at national laboratories and supercomputing centers. This is largely a historical artifact: when UNIX was developed over 30