The E-BSP Model: Incorporating General Locality and Unbalanced Communication into the BSP Model

Ben H.H. Juurlink and Harry A.G. Wijshoff
High Performance Computing Division, Department of Computer Science
Leiden University, P.O. Box 9512, 2300 RA Leiden, The Netherlands
{benj,harryw}@cs.leidenuniv.nl

Abstract. The BSP model was proposed as a step towards general purpose parallel computing. This paper introduces the E-BSP model that extends the BSP model in two ways. First, it provides a way to deal with unbalanced communication patterns, i.e., communication patterns in which the amount of data sent or received by each processor is different. Second, it adds a notion of general locality to the BSP model where the delay of a remote memory access depends on the relative location of the processors in the interconnection network. We use our model to develop several algorithms that improve upon algorithms derived under the BSP model.

1 Introduction

It has been stressed by many authors that the emergence of one or a few computational models is essential to the progress of parallel computing [9, 14], because it enables the programmer to write architecture-independent software. Such a model should strike a balance between simplicity of usage and reflectivity of existing parallel architectures. Despite many efforts, no consensus has been reached on which model should be used.

One model that has gained some acceptance is the Bulk-Synchronous Parallel (BSP) model proposed by Valiant [15]. A BSP computer consists of a number of processors connected by a point-to-point message router. Furthermore, it includes a mechanism to barrier synchronize the nodes. Computations on the BSP model are organized in a series of supersteps, with synchronization taking place between supersteps.
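The superstep organization described above can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions: the names run_superstep, send_to_root, accumulate and bsp_sum are hypothetical and do not come from the paper, the "processors" are list entries, and the barrier is modeled by delivering all messages only after every processor has finished its local step.

```python
from typing import Callable, List, Tuple

# A message is (destination processor id, payload).
Message = Tuple[int, int]
# A step maps (pid, local state, inbox) to (new state, outgoing messages).
StepFn = Callable[[int, int, List[int]], Tuple[int, List[Message]]]


def run_superstep(states: List[int], inboxes: List[List[int]], step: StepFn):
    """Simulate one superstep: local computation, then delivery at the barrier."""
    p = len(states)
    new_states: List[int] = []
    outgoing: List[Message] = []
    for pid in range(p):
        # Each processor sees only its own state and the messages that
        # were delivered at the end of the previous superstep.
        state, msgs = step(pid, states[pid], inboxes[pid])
        new_states.append(state)
        outgoing.extend(msgs)
    # Barrier: messages sent in this superstep become visible in the next one.
    new_inboxes: List[List[int]] = [[] for _ in range(p)]
    for dest, payload in outgoing:
        new_inboxes[dest].append(payload)
    return new_states, new_inboxes


def send_to_root(pid: int, state: int, inbox: List[int]):
    # Superstep 1: every processor except 0 sends its value to processor 0.
    return state, ([] if pid == 0 else [(0, state)])


def accumulate(pid: int, state: int, inbox: List[int]):
    # Superstep 2: each processor adds what it received (only 0 received anything).
    return state + sum(inbox), []


def bsp_sum(values: List[int]) -> int:
    """Sum one value per processor in two supersteps (hypothetical example)."""
    states = list(values)
    inboxes: List[List[int]] = [[] for _ in values]
    states, inboxes = run_superstep(states, inboxes, send_to_root)
    states, inboxes = run_superstep(states, inboxes, accumulate)
    return states[0]
```

Note that in the first superstep processor 0 receives one item from every other processor while each of the others sends a single item, so the example is itself an instance of the unbalanced communication patterns mentioned in the abstract.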
The performance of a BSP computer depends on three parameters: the number of processors p, the synchronization cost / communication latency L, and the computation-to-communication throughput ratio g. Typical values for g and L are [...] for meshes, [...] and [...] for hypercubes, and [...] for hypercube-derivative networks such as butterflies.

This paper introduces the Extended BSP or E-BSP model that extends the BSP model in two ways. First, it provides a method to deal with unbalanced communication patterns. Under the BSP model, the cost of communication depends on the largest amount of data sent or received by any processor. In many situations, this will be a large overestimate. Consider for example a personalized broadcast in which one processor sends [...] distinct items, each to a different destination. Under the BSP model, the cost of this operation is [...]. However, it can be seen that on any network that contains a Hamiltonian path a personalized broadcast can be implemented in [...] time.

The second extension we propose adds a notion of general locality (or network proximity) to the BSP model. The BSP model only distinguishes between local and non-local