Solving the Constant-Degree Parallelism Alignment Problem

Claude G. Diderich 1,* and Marc Gengler 2

1 Swiss Federal Institute of Technology - Lausanne, Computer Science Department, CH-1015 Lausanne, Switzerland, E-mail: diderich@di.epfl.ch
2 Ecole Normale Supérieure de Lyon, Laboratoire de l'Informatique du Parallélisme, F-69364 Lyon, France, E-mail: Marc.Gengler@ens-lyon.fr

Abstract. We describe an exact algorithm for finding a computation mapping and data distributions that minimize, for a given degree of parallelism, the number of remote data accesses in a distributed memory parallel computer (DMPC). This problem is shown to be NP-hard.

1 The alignment problem

An important problem when compiling nested loops for DMPCs is how to map the computation and the data onto processors. This problem can be subdivided into two subproblems: 1) the alignment problem, which assigns computation and data to a set of virtual processors, and 2) the mapping problem, which folds the set of virtual processors onto the physical ones. In this paper we address the alignment problem.

Following the linear algebra formulation of the alignment problem by Huang and Sadayappan [6] in 1991, researchers have primarily focused on finding linear or affine computation and data alignment functions requiring no remote data accesses [3] or on developing heuristics for minimizing communication [2].

The alignment problem is the problem of finding an alignment of loop iterations with the array elements accessed, that is, mappings of the loop iterations and array elements to a set of virtual processors. The alignment should address two needs: i) maximize the degree of parallelism, i.e. use as many processors as possible, ii) minimize the number of non-local data accesses, i.e. distribute the array elements such that a processor owns a maximal number of the elements it accesses. Depending on how the needs i) and ii) are met, various subproblems can be defined.
When only local data accesses are allowed, we speak of the communication-free alignment problem. Another subproblem is defined by minimizing the number of remote data accesses for a given degree of parallelism. This subproblem is called the constant-degree parallelism alignment problem.

We consider array access functions that are linear or affine and use the approach presented by Bau et al. [3] for expressing the alignment problem. Access $l$ to array $k$ is described by a function $F_k^l$. The unknown computation mappings $C_j$ and data mappings $D_k$ can also be written as matrix functions. $Z$ represents the index domain defined by the loop bounds, $\mathcal{D}_k$ the array access domain and $P$ the virtual multi-dimensional grid of processors.

* Supported by a grant from the Swiss Federal Institute of Technology - Lausanne.
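
As a rough illustration of the quantities introduced in Section 1, the following Python sketch counts the remote accesses caused by a single affine array access. It assumes that iteration i runs on virtual processor C i, that array element a is owned by processor D a, and that the access is therefore local exactly when C i = D(F i + f). The function names, the inclusive loop bounds and the 4 x 4 example are illustrative assumptions, not taken from the paper, and this brute-force count is not the exact algorithm announced in the abstract.

```python
from itertools import product


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector; returns a tuple."""
    return tuple(sum(a * b for a, b in zip(row, vector)) for row in matrix)


def count_remote_accesses(bounds, C, D, F, f):
    """Count iterations whose access F i + f is not owned by the executing processor.

    bounds: list of inclusive (lower, upper) pairs, one per loop (assumed form)
    C:      computation mapping matrix, iteration -> virtual processor
    D:      data mapping matrix, array element -> virtual processor
    F, f:   linear part and offset of the affine access function
    """
    remote = 0
    for i in product(*(range(lo, hi + 1) for lo, hi in bounds)):
        element = tuple(x + y for x, y in zip(mat_vec(F, i), f))
        # The access is local only if executor and owner coincide.
        if mat_vec(C, i) != mat_vec(D, element):
            remote += 1
    return remote


# Hypothetical 4 x 4 loop nest; iteration (i, j) runs on processor i,
# and the array is distributed by rows.
bounds = [(0, 3), (0, 3)]
C = [[1, 0]]
D = [[1, 0]]

# Access A[i][j + 1]: the row index equals i, so every access is local.
row_access = count_remote_accesses(bounds, C, D, [[1, 0], [0, 1]], (0, 1))

# Access A[j][i]: the owner is processor j, so only the diagonal is local.
transposed_access = count_remote_accesses(bounds, C, D, [[0, 1], [1, 0]], (0, 0))

print(row_access, transposed_access)  # prints "0 12"
```

In this sketch the first access is communication-free under the chosen mappings, while the transposed access leaves 12 of the 16 iterations with a remote access. Choosing C and D to make such counts as small as possible for a fixed degree of parallelism is the optimization the constant-degree parallelism alignment problem asks for.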