Graph Partitioning for Parallel Applications in Heterogeneous Grid Environments

Shailendra Kumar and Sajal K. Das
Dept. of Computer Science & Engineering
The University of Texas at Arlington
Arlington, TX 76019-0015
{skumar,das}@cse.uta.edu

Rupak Biswas
NASA Advanced Supercomputing Division
NASA Ames Research Center
Moffett Field, CA 94035-1000
rbiswas@nas.nasa.gov

Abstract

The problem of partitioning irregular graphs for parallel computations on homogeneous systems has been extensively studied. However, these solutions fail when the target system architecture exhibits heterogeneity in resource characteristics. With the emergence of technologies such as the Grid, it is imperative to study the partitioning problem in the context of distributed heterogeneous systems. In our Grid model, the system consists of processors with varying computational power that are connected via a non-uniform communication network. We present a novel multilevel partitioning algorithm, called MiniMax, for irregular graphs that takes into account issues pertinent to Grid computing environments. The proposed scheme generates and maps partitions onto a heterogeneous system with the objective of minimizing the maximum execution time of the parallel distributed application. Simulation results for both synthetic and real workloads demonstrate that MiniMax generates high-quality partitions for various classes of applications targeted for parallel execution in a distributed heterogeneous environment.

1. Introduction

The popularity of Grid infrastructures [2] has stirred active research into harnessing computational capacity from distributed heterogeneous systems. The Grid exploits existing systems to solve problems efficiently and cost-effectively, as opposed to replacing these systems with yet more powerful and expensive machines.
It provides a metacomputing platform promising computational power of a magnitude never before anticipated, at affordable cost; however, it comes with challenges for the research community to successfully map varied applications onto such heterogeneous collections of high-performance systems.

This work was supported by NASA Ames Research Center under Cooperative Agreement Number NCC 2-5395.

The Information Power Grid (IPG) [7], NASA's venture into the Grid computing arena, aims to provide scientific and engineering communities orders-of-magnitude increases in their ability to solve problems that depend on the use of large-scale dispersed resources: aggregated computing, diverse data archives, remote laboratory instruments, scattered engineering test facilities, and geographically separated human collaborators. However, the scope of this work is limited to the parallel and distributed solution of large-scale computational problems (also known as aggregated computing).

The successful deployment of computation-intensive applications in a Grid environment such as the IPG involves efficient partitioning and load balancing on a truly heterogeneous distributed metasystem without making any assumptions about the resources that comprise it. As more diverse resources are added, the Grid scales to larger capability and greater heterogeneity. The interconnects could range from a very high speed optical link between the Grid nodes to relatively slow intra-node communication for a loosely-coupled network of workstations serving as a single node. The reverse is also possible, where inter-node communication is slower than intra-node links. Moreover, there are no constraints on the network topology connecting these Grid resources. Given that the processors themselves have varying computational power, we have a purely heterogeneous system model with varying system characteristics.
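To make the system model concrete, the following Python sketch evaluates a given mapping of graph vertices to processors under a toy cost model. The cost model is an assumption made here for illustration and is not taken from this paper: each processor's time is its assigned work divided by its speed, plus the volume of every cut edge it touches multiplied by the per-unit cost of the link between the two processors involved. The function name `max_execution_time` and all data values are hypothetical.

```python
from collections import defaultdict


def max_execution_time(vertex_weight, edges, assignment, speed, link_cost):
    """Return the largest per-processor time under a toy cost model.

    vertex_weight: {vertex: computational work}
    edges:         {(u, v): communication volume}, one entry per undirected edge
    assignment:    {vertex: processor id}
    speed:         {processor id: work units per time unit}
    link_cost:     {(p, q): time per unit volume}, keyed with p < q
    """
    time = defaultdict(float)
    # Computation: work is scaled by the speed of the hosting processor.
    for v, work in vertex_weight.items():
        p = assignment[v]
        time[p] += work / speed[p]
    # Communication: only edges cut by the mapping cost anything,
    # and both endpoints' processors pay for the transfer (assumed).
    for (u, v), volume in edges.items():
        p, q = assignment[u], assignment[v]
        if p != q:
            cost = volume * link_cost[(min(p, q), max(p, q))]
            time[p] += cost
            time[q] += cost
    return max(time.values())


# Hypothetical 4-vertex path graph on two processors; processor 0 is twice as fast.
vertex_weight = {"a": 3, "b": 3, "c": 3, "d": 3}
edges = {("a", "b"): 1, ("b", "c"): 2, ("c", "d"): 1}
speed = {0: 2.0, 1: 1.0}
link_cost = {(0, 1): 0.5}

equal_split = {"a": 0, "b": 0, "c": 1, "d": 1}   # balanced as if homogeneous
speed_aware = {"a": 0, "b": 0, "c": 0, "d": 1}   # more work on the faster processor

print(max_execution_time(vertex_weight, edges, equal_split, speed, link_cost))  # 7.0
print(max_execution_time(vertex_weight, edges, speed_aware, speed, link_cost))  # 5.0
```

In this small example, the split that balances vertex weights equally finishes later than the one that gives the faster processor more work, which is the kind of effect a heterogeneity-aware objective is meant to capture.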
Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS02) 1530-2075/02 $17.00 © 2002 IEEE

Multilevel algorithms such as Chaco [5], Metis [9], and Jostle [13] have used similar approaches to solve the graph partitioning problem, but for homogeneous systems. Consequently, such partitioners fail to address the limitations imposed by heterogeneity in the underlying system model. Furthermore, they do not delve into the mapping problem of assigning partitions to processors so as to reduce application execution time, which is the ultimate objective. Above all,