Fast Measurement of LogP Parameters for Message Passing Platforms

Thilo Kielmann, Henri E. Bal, and Kees Verstoep
Department of Computer Science, Vrije Universiteit, Amsterdam, The Netherlands
kielmann@cs.vu.nl bal@cs.vu.nl versto@cs.vu.nl

Abstract. Performance modeling is important for implementing efficient parallel applications and runtime systems. The LogP model captures the relevant aspects of message passing in distributed-memory architectures. In this paper we describe an efficient method that measures LogP parameters for a given message passing platform. Measurements are performed for messages of different sizes, as covered by the parameterized LogP model, a slight extension of LogP and LogGP. To minimize both the intrusiveness and the completion time of the measurement, we propose a procedure that sends as few messages as possible. An implementation of this procedure, called the MPI LogP benchmark, is available from our WWW site.

1 Introduction

Performance modeling is important for implementing efficient parallel applications and runtime systems. For example, application-level schedulers (AppLeS) [2] aim to minimize application runtime based on application-specific performance models (e.g., for completion times of given subtasks), which are parameterized by dynamic resource performance characteristics of CPUs and networks. An AppLeS may, for example, determine suitable data distributions and task assignments based on knowledge of message transfer times and computation completion times.

Another example of the use of performance models is our MagPIe library [8, 9], which optimizes MPI’s collective communication. Based on a model for the completion times of message sending and receiving, it optimizes communication graphs (e.g., for broadcast and scatter) and finds suitable segment sizes for splitting large messages in order to minimize collective completion time.

The LogP model [4] captures the relevant aspects of message passing in distributed-memory systems. It defines the number of processors P, the network latency L, and the time (overhead) o a processor spends sending or receiving a message. In addition, it defines the gap g as the minimum time interval between consecutive message transmissions or receptions at a processor, which is the reciprocal value of the achievable end-to-end bandwidth. Because LogP is intended for short messages, o and g are constant. The LogGP model extends LogP to also cover long messages [1]. It adds a parameter G for modeling the gap per byte for long messages, which are typically handled more efficiently. Other variants of LogP have also been proposed in which the overhead at the sender and the receiver side is treated separately as o_s and o_r, and in which some parameters depend on the message size [5, 7, 8].
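
To make the roles of these parameters concrete, the following minimal Python sketch evaluates the point-to-point cost formulas commonly given for LogP and LogGP in the literature: o + L + o for a single short message, and o + (k - 1)G + L + o for a k-byte long message. It is an illustration only, not the parameterized LogP model or the measurement procedure of this paper. The class name, the function names, and the numeric values in the usage note are hypothetical, and all times are assumed to share one unit (e.g., microseconds).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogGPParams:
    """Hypothetical container for the LogP/LogGP parameters (one time unit)."""
    L: float  # network latency
    o: float  # overhead a processor spends sending or receiving a message
    g: float  # minimum gap between consecutive messages at a processor
    G: float  # gap per byte for long messages (LogGP extension)
    P: int    # number of processors


def short_message_time(p: LogGPParams) -> float:
    """End-to-end time of one short message under LogP.

    Send overhead, then network latency, then receive overhead.
    """
    return p.o + p.L + p.o


def long_message_time(p: LogGPParams, k: int) -> float:
    """End-to-end time of one k-byte message under LogGP.

    The first byte costs the send overhead; each of the remaining
    k - 1 bytes adds the per-byte gap G before latency and receive
    overhead are paid.
    """
    if k < 1:
        raise ValueError("message size k must be at least 1 byte")
    return p.o + (k - 1) * p.G + p.L + p.o


def short_message_rate(p: LogGPParams) -> float:
    """Short messages per time unit a processor can sustain (reciprocal of g)."""
    return 1.0 / p.g
```

With made-up values such as `LogGPParams(L=5.0, o=2.0, g=4.0, G=0.5, P=8)`, `short_message_time` adds both overheads to the latency, while `long_message_time` with `k=1` reduces to the same short-message formula, which is one way to check that the two models agree at the boundary.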