Parallelism without Pain: Orchestrating Computational Algebra Components into a High-Performance Parallel System

A. D. Al Zain #1, P. W. Trinder #2, K. Hammond *3, A. Konovalov *4, S. Linton *5, J. Berthold +6

# School of Mathematics and Computer Sciences, Heriot-Watt University, Edinburgh, UK
* School of Computer Science, University of St Andrews, St Andrews, UK
+ Philipps-Universität Marburg, Fachbereich Mathematik und Informatik, Hans-Meerwein-Straße, Marburg, Germany

1 a.d.alzain@hw.ac.uk, 2 p.w.trinder@hw.ac.uk, 3 kh@cs.st-and.ac.uk, 4 alexk@mcs.st-and.ac.uk, 5 sal@cs.st-and.ac.uk, 6 berthold@informatik.uni-marburg.de

Abstract

This paper describes a very high-level approach that aims to orchestrate sequential components written using high-level domain-specific programming into high-performance parallel applications. By achieving this goal, we hope to make parallel programming more accessible to experts in mathematics, engineering and other domains. A key feature of our approach is that parallelism is achieved without any modification to the underlying sequential computational algebra systems, or to the user-level components: rather, all orchestration is performed at an outer level, with sequential components linked through a standard communication protocol, the Symbolic Computing Software Composability Protocol, SCSCP. Despite the generality of our approach, our results show that we are able to achieve very good, and in some cases super-linear, speedups on clusters of commodity workstations: up to a factor of 33.4 on a 28-processor cluster. We are, moreover, able to parallelise a wider variety of problems, and achieve higher performance, than typical specialist parallel computational algebra implementations.
1 Introduction

Despite the availability of standard message-passing libraries, such as PVM/MPI, and other modern programming tools, writing effective parallel programs is still hard, even for those with a good background in computer science. For domain experts in other specialisms, the difficulty is compounded. At the same time, support for parallelism in domain-specific programming environments is often poor or even completely lacking. It is fair to say that for the average non-specialist, while the situation has certainly improved compared with the experiences of pioneer parallel programmers, effective parallel programming still involves considerable pain.

In this paper, we describe the design and implementation of a new system, SymGrid-Par, which aims to orchestrate sequential computational algebra components from a variety of computational algebra systems into coherent parallel programs. A wide variety of computational algebra systems exist today: commercial examples include Maple [8], Mathematica [29] and MuPAD [31]; while free examples include Kant [12] and GAP [21]. The programmer base for these systems is large (for example, it is estimated that there are several million users of Maple worldwide), diverse (comprising mathematicians, engineers, scientists and economists), and may lack the technical skills that are necessary to exploit widely-used packages such as PVM/MPI.

Although many computational algebra applications are computationally intensive, and could, in principle, make good use of the cheap and readily available parallel architectures of cluster machines, relatively few parallel implementations of computational algebra systems have been produced. Those that are available can be unreliable and difficult to use. Indeed, in at least one case [10], the underlying computational algebra engine has been explicitly optimised to be single-threaded, rendering parallelisation a major and daunting task.
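The orchestration idea described above — treating each computational algebra call as a black-box sequential component, and evaluating independent calls in parallel from an outer coordination layer — can be illustrated with a minimal sketch. The sketch below is hypothetical and not the SymGrid-Par API: `algebra_call` stands in for an expensive sequential component call (e.g. a group-order computation delegated to GAP over SCSCP), and a thread pool stands in for the distributed scheduling layer; real CPU parallelism would use processes or remote workers.

```python
from concurrent.futures import ThreadPoolExecutor
import math

# Hypothetical stand-in for a black-box sequential component call,
# e.g. a computation delegated to a GAP server over SCSCP.
def algebra_call(n):
    # Order of the symmetric group S_n, i.e. n!
    return math.factorial(n)

# Orchestrate independent component calls concurrently. The components
# themselves are unmodified: the coordination layer only schedules them
# and collects their results, mirroring the outer-level orchestration
# described in the paper.
def par_orchestrate(component, inputs, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(component, inputs))

print(par_orchestrate(algebra_call, range(1, 9)))
# prints [1, 2, 6, 24, 120, 720, 5040, 40320]
```

The essential point is that `par_orchestrate` needs no knowledge of how `algebra_call` computes its result; it could equally dispatch requests to several independent computational algebra processes.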
By providing an external mechanism that is capable of orchestrating individual sequential components into a coherent parallel program, we aim to facilitate the parallelisation of a variety of computational algebra systems in a way that can be exploited by their normal users.

The key advantages of our approach are: i) by using external middleware, it is not necessary to change the sequen-

2008 International Symposium on Parallel and Distributed Processing with Applications, 978-0-7695-3471-8/08 $25.00 © 2008 IEEE, DOI 10.1109/ISPA.2008.19, p. 99