Under consideration for publication in J. Functional Programming

Shared Memory Multiprocessor Support for Functional Array Processing in SaC

CLEMENS GRELCK
Universität zu Lübeck
Institut für Softwaretechnik und Programmiersprachen
Ratzeburger Allee 160, 23538 Lübeck, Germany
(e-mail: grelck@isp.uni-luebeck.de)

Abstract

Classical application domains of parallel computing are dominated by processing large arrays of numerical data. Whereas most functional languages focus on lists and trees rather than on arrays, SaC is tailor-made in design and in implementation for efficient high-level array processing. Advanced compiler optimizations yield performance levels that are often competitive with low-level imperative implementations.

Based on SaC, we develop compilation techniques and runtime system support for the compiler-directed parallel execution of high-level functional array processing code on shared memory architectures. Competitive sequential performance gives us the opportunity to exploit the conceptual advantages of the functional paradigm for achieving real performance gains with respect to existing imperative implementations, not only in comparison with uniprocessor runtimes. While the design of SaC facilitates parallelization, the particular challenge of high sequential performance is that realization of satisfying speedups through parallelization becomes substantially more difficult.

We present an initial compilation scheme and multi-threaded execution model, which we step-wise refine to reduce organizational overhead and to improve parallel performance. We close with a detailed analysis of the impact of certain design decisions on runtime performance, based on a series of experiments.

Contents

1 Introduction
2 Arrays and array operations
3 Generating multi-threaded code
4 With-loop scheduling
5 Enhancing the execution model
6 SPMD optimization
7 Experimental evaluation
8 Conclusion
References