Designing parallel systems: a performance prediction problem

E. Luque, R. Suppi and J. Sorribes

PSEE (Parallel System Evaluation Environment) is a software tool that provides a multiprocessor system for research into alternative architectural decisions and for experimentation with issues such as selection, design, tuning, scheduling, clustering and routing policies. PSEE facilitates simulation and performance evaluation, as well as a prediction environment for the design and tuning of parallel systems. These tasks involve cycles through programming, simulation, measurement, visualization and modification of parallel system parameters. PSEE includes a parallel programming tool; a simulator for link-oriented parallel systems, BOLAS; and a performance evaluation tool, GRAPH. These PSEE modules support the above tasks in a user-friendly, interactive and animated graphical form. PSEE provides quantitative information in a tailored graphical form. This numerical/graphical output helps the user make decisions about his/her particular development.

Keywords: parallel systems, performance evaluation, architecture

Parallel computer systems make possible the fast execution of a vast set of algorithms which conventional uniprocessors can handle only with great difficulty. The implementation of an algorithm on one manufacturer's uniprocessor will differ in speed by no more than a constant factor from its implementation on another's, but on parallel machines we have no such guarantee. Parallel computer systems are much more complex than uniprocessors, and this complexity causes many design and implementation problems. One such problem is the development of working models for evaluating the behaviour and performance of parallel computer systems under varying conditions. Since these systems are expensive to build, the need for such alternatives is great. On the other hand, as the number of commercial systems increases, so does the diversity of their architectural designs.
Because each new architecture diverged from the classical von Neumann model, new languages and updated versions of older sequential languages were developed for execution on these new machines. This made it difficult to run a standard benchmark. To date, no formal methods allow comparisons among different machines running a single common application. Furthermore, the code is generally not portable among different parallel processing machines. This forces applications to be recoded for each language and each machine. This situation gives a new dimension to the complexity of parallel systems.

Departament d'Informàtica, Universitat Autònoma de Barcelona, Bellaterra (08193), Spain. Email: llNFD@ccuabi.uab.es
This paper was first published in Microprocessors and Microsystems Vol 16 No 1.

Complexity in parallel systems has two main aspects:

- How to obtain the best algorithm with the maximum parallelism in order to represent the solution or the target sequential algorithm. In spite of the large number of basic elements available to expert programmers for parallel programming, solving this problem is not an easy task. A first approach is to use previous experience in order to generate the coordination and synchronization tasks for concurrent processes sharing common variables and data structures. At present, these tasks are very common in conventional operating systems like UNIX, but only on single-processor systems.

- Once we have the algorithm which represents the function and behaviour of a set of applications (Virtual Algorithm Model), how to select the best architecture and its tuning for this algorithm. Parameters such as the number of processors, interconnection topology, clustering, mapping, scheduling policies and routing strategies are present in a parallel system but not in serial system design. This introduces new degrees of freedom and greater complexity into the process of architecture selection for a given application.
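The coordination and synchronization tasks described in the first aspect can be sketched in a minimal, single-machine form: several concurrent workers updating a shared variable under mutual exclusion. This is a hypothetical illustration of the general technique, not PSEE code; all names are invented.

```python
import threading

# Hypothetical shared data structure updated by several concurrent workers.
counter = 0
lock = threading.Lock()

def worker(increments):
    """Each worker increments the shared counter inside a critical section."""
    global counter
    for _ in range(increments):
        with lock:          # mutual exclusion: one worker at a time
            counter += 1

# Four concurrent workers, 10 000 increments each.
threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000: the lock makes the shared update deterministic
```

Without the lock, the read-modify-write on `counter` could interleave and lose updates; this is exactly the kind of coordination that, as the text notes, uniprocessor operating systems already provide but parallel systems must re-solve across processors.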
In order to approach the above problem, an integrated environment for parallel system evaluation has been conceived and implemented. The set of tools included in this environment is oriented towards distributed-memory message-passing MIMD systems. In this kind of parallel system (link oriented), the main parameters to be considered in the selection of an optimum architecture are the following:

Granularity. An algorithm may have maximum parallelism but still not be the best solution for execution on a particular parallel machine. Granularity is a measure of the ratio between the computing time in a processor and its communication time with other processors. An application that permits changing the granularity is an algorithm which with a particular clustering

Vol 34 No 12 December 1992  0950-5849/92/120813-11  © 1992 Butterworth-Heinemann Ltd
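The granularity ratio just defined can be illustrated with a small numeric sketch. The timings below are hypothetical; the point is only how clustering fine-grained tasks onto one processor turns inter-task messages into local accesses and so raises granularity.

```python
def granularity(compute_time, comm_time):
    """Granularity as defined in the text: computing time in a processor
    divided by its communication time with other processors."""
    return compute_time / comm_time

# Four fine-grained tasks (hypothetical timings): 2 ms of computation
# and 1 ms of external communication each.
tasks = [(2.0, 1.0)] * 4
total_compute = sum(c for c, _ in tasks)
total_comm = sum(m for _, m in tasks)
g_fine = granularity(total_compute, total_comm)

# Clustering all four tasks on one processor: their mutual messages
# become local memory accesses, leaving (say) a single 1 ms external
# communication, so granularity rises.
g_clustered = granularity(total_compute, 1.0)

print(g_fine, g_clustered)  # 2.0 8.0
```

In this sketch the coarser clustering quadruples the granularity, which is the trade-off the text alludes to: maximum parallelism is not automatically the best match for a particular link-oriented machine.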