The Implementation of the BSP Parallel Computing Model on the InteGrade Grid Middleware *

Andrei Goldchleger, Alfredo Goldman, Ulisses Hayashida, Fabio Kon
Department of Computer Science
University of São Paulo, Brazil
{andgold,gold,ulisses,kon}@ime.usp.br
http://gsd.ime.usp.br/integrade

* This work is supported by a grant from CNPq, Brazil, process #55.0094/2005-9.

MGC’05, November 28-December 2, 2005, Grenoble, France. Copyright 2005 ACM 1-59593-269-0/05/11 ...$5.00.

ABSTRACT

InteGrade is an object-oriented grid middleware infrastructure whose goal is to leverage existing computational resources in organizations. Rather than relying solely on dedicated hardware such as reserved clusters, InteGrade focuses on using desktops in users’ offices, machines in computer laboratories, and shared workstations, in addition to dedicated clusters. In this paper, we describe the support for the execution of highly coupled parallel applications on top of InteGrade. We describe how the middleware was implemented to support BSP parallel applications, which have global synchronization points, and present experimental results.

Categories and Subject Descriptors

C.2.4 [Computer-Communication Networks]: Distributed Systems - Distributed applications; D.1.3 [Programming Techniques]: Concurrent Programming - Parallel Programming

General Terms

Parallel Computing Library, Performance

Keywords

BSP, Parallel Computing, Grid Computing

1. INTRODUCTION

InteGrade [9] is a Grid Computing system aimed at commodity workstations such as household PCs, corporate employee workstations, and PCs in shared laboratories. It uses the idle computing power of these machines to perform useful computation. Our goal is to allow organizations to use their existing computing infrastructure to perform useful computation, without requiring the purchase of additional hardware. Moreover, users who share the idle portion of their resources should have their quality of service preserved by the InteGrade middleware.

Despite the great computing power available today in most organizations in the form of desktop PCs, it is still difficult to use the idle cycles of these machines for useful computation. To address this, we implemented support for distributing and executing two different kinds of parallel applications. First, we extended the interface of InteGrade to support parametric applications, in which there is no communication among application nodes. This kind of application, which belongs to the bag-of-tasks class, is currently supported on non-dedicated machines by other grid middleware such as OurGrid (www.ourgrid.org) and BOINC [2]. Second, we implemented a modern parallel computing model, Bulk Synchronous Parallel (BSP) [24, 19], to support applications whose nodes do communicate with each other, i.e., highly coupled parallel applications. The BSP reference implementation is the University of Oxford’s BSPlib [22]. The BSPlib core library is simple and is composed of only 20 functions. Compared to PVM [21] and MPI [8], two popular parallel computing libraries, BSP offers a much more elegant computing model and a simpler programming library.

In BSP, the processes of a parallel application share global synchronization points. Using these synchronization points, BSP applications can be better adapted to an environment subject to frequent changes, such as the Grid.
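As an illustration only, the role of global synchronization points can be sketched in Python, with threads standing in for BSP processes. This is neither BSPlib nor InteGrade code: the function name `bsp_ring_sum`, the per-superstep `inbox` buffers, and the ring communication pattern are assumptions made for the example. Each simulated process sends a value to its right neighbour, waits at a barrier shared by all processes, and only then reads what it received.

```python
import threading


def bsp_ring_sum(nprocs):
    """Simulate a BSP computation with threads (illustrative only).

    Process pid starts with the value pid + 1. After nprocs - 1
    supersteps, every process holds the sum of all initial values.
    """
    barrier = threading.Barrier(nprocs)  # the global synchronization point
    supersteps = nprocs - 1
    # inbox[s][p] holds the message delivered to process p in superstep s.
    # One buffer per superstep avoids races between consecutive supersteps.
    inbox = [[None] * nprocs for _ in range(supersteps)]
    result = [None] * nprocs

    def process(pid):
        total = pid + 1      # local data of this process
        forward = total      # value to pass along the ring
        for step in range(supersteps):
            # Communication phase: send to the right neighbour.
            inbox[step][(pid + 1) % nprocs] = forward
            # Barrier: the superstep ends once every process arrives here.
            barrier.wait()
            # Messages sent during the superstep are visible only now.
            forward = inbox[step][pid]
            # Local computation on the received data.
            total += forward
        result[pid] = total

    threads = [threading.Thread(target=process, args=(p,)) for p in range(nprocs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

Calling `bsp_ring_sum(4)` runs three supersteps on four simulated processes. Because every process reaches the barrier before any of them proceeds, the state of the whole application is well defined at each barrier, which is the property that makes these points convenient places to react to changes in the environment.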
The BSP synchronization points also greatly facilitate the implementation of checkpointing, which permits recovery in the presence of failures; such failures are very common in opportunistic grid computing. In addition, with checkpointing, BSP parallel applications can expand or shrink dynamically, using a larger or smaller number of processors to adapt to the availability of Grid resources.

In this paper, we discuss the implementation of the BSP model on top of the InteGrade grid middleware, using its distributed scheduling and allocation services. The structure of the paper is as follows. Section 2 discusses support for parallel applications in other grid platforms, and Section 3 describes the major concepts behind BSP and BSPlib. Section 4 presents a brief description of the InteGrade system and architecture. Section 5 focuses on our implementation of the BSP model. We present our conclusions in Section 6.