Scheduling vs Communication in PELCR

Marco Pedicini¹ and Francesco Quaglia²

¹ IAC, Consiglio Nazionale delle Ricerche, Roma, Italy
² DIS, Università “La Sapienza”, Roma, Italy

Abstract. PELCR is an environment for λ-term reduction on parallel/distributed computing systems. The computation performed in this environment is a distributed graph rewriting, and a major optimization to achieve efficient execution consists of a message aggregation technique exhibiting the potential for a strong reduction of the communication overhead. In this paper we discuss the interaction between the effectiveness of aggregation and the schedule sequence of rewriting operations. Then we present a Priority Based (PB) scheduling algorithm well suited for the specific aggregation technique. Results on a classical benchmark λ-term demonstrate that PB allows PELCR to achieve up to 88% of the ideal speedup while executing on a shared memory parallel architecture.

1 Introduction

PELCR (Parallel Environment for Lambda-Calculus Reduction) is a recent software package [7] for efficient optimal reduction of λ-terms on parallel/distributed computing systems. The development of this software is based on results in the field of functional programming, which have shown how the reduction of λ-terms can be mapped onto a particular graph rewriting technique known as Directed Virtual Reduction (DVR) [1,2,3,4,5]. In DVR, each computational step corresponds to a transition from a graph G to a graph G′ obtained through the composition of two labeled edges, say e′ and e′′, insisting on a node v. Such a composition has the following effects: (i) a new node v′ is created with two exiting labeled edges that point to the source nodes of e′ and e′′ respectively, (ii) the labels of e′ and e′′ are modified to reflect that the two edges must not be composed anymore.

PELCR allows edge compositions to be performed concurrently by supporting the graph distribution among multiple machines.
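The composition step described above can be illustrated with a minimal Python sketch. It is only a structural illustration under stated assumptions, not PELCR's implementation: the names `Graph`, `Edge`, `compose`, and `composed_with` are hypothetical, the computation of the new edge labels is not specified in this passage and is replaced here by a placeholder string, and the "must not be composed anymore" marking is modeled as a set of edge identifiers rather than as a change to the labels themselves.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Edge:
    source: int
    target: int
    label: str
    # Hypothetical bookkeeping: ids of edges this edge has already been composed with.
    composed_with: set = field(default_factory=set)


class Graph:
    def __init__(self):
        self._node_ids = count()
        self._edge_ids = count()
        self.nodes = set()
        self.edges = {}

    def add_node(self):
        node = next(self._node_ids)
        self.nodes.add(node)
        return node

    def add_edge(self, source, target, label):
        edge_id = next(self._edge_ids)
        self.edges[edge_id] = Edge(source, target, label)
        return edge_id

    def compose(self, e1, e2):
        """Compose two edges entering the same node; return the new node, or None."""
        a, b = self.edges[e1], self.edges[e2]
        if a.target != b.target:
            raise ValueError("edges must insist on the same node")
        if e2 in a.composed_with:
            return None  # this pair has already been composed
        # (i) create a new node with two exiting edges that point to the
        #     source nodes of the composed edges (labels are placeholders).
        new_node = self.add_node()
        self.add_edge(new_node, a.source, f"res({a.label},{b.label})")
        self.add_edge(new_node, b.source, f"res({b.label},{a.label})")
        # (ii) record that the two edges must not be composed again.
        a.composed_with.add(e2)
        b.composed_with.add(e1)
        return new_node
```

Each call to `compose` touches only the two edges involved and the node it creates, which is what makes it plausible to perform many such steps concurrently once the graph is distributed.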
With respect to graph distribution, the nodes dynamically originated by DVR steps are distributed according to a load balancing mechanism able to prevent overload on any machine.

B. Monien and R. Feldmann (Eds.): Euro-Par 2002, LNCS 2400, pp. 648–655. © Springer-Verlag Berlin Heidelberg 2002

An additional optimization embedded in PELCR deals with communication, which is implemented as message exchange on top of the MPI layer. More precisely, PELCR adopts a message aggregation technique that collects application messages destined to the same machine (each of those messages notifies the existence of a new edge in the graph) and delivers them using a single MPI message. The advantage of aggregation is the reduction of the network path setup time, due to the reduction in the number of MPI messages. On the other hand, aggregation delays the delivery of application messages since they are not sent as