Parallel Computing 9 (1988/89) 291-312
North-Holland

A projection method for solving nonsymmetric linear systems on multiprocessors *

Chandrika KAMATH and Ahmed SAMEH
Center for Supercomputing Research and Development, University of Illinois-Urbana, 305 Talbot Laboratory, 104 South Wright Street, Urbana, IL 61801-2932, U.S.A.

Received March 1987
Revised January 1988

Abstract. We consider the iterative solution of large sparse linear systems of equations arising from elliptic and parabolic partial differential equations in two or three space dimensions. Specifically, we focus our attention on nonsymmetric systems of equations whose eigenvalues lie on both sides of the imaginary axis, or whose symmetric part is not positive definite. This system of equations is solved using a block Kaczmarz projection method with conjugate gradient acceleration. The algorithm has been designed with special emphasis on its suitability for multiprocessors. In the first part of the paper, we study the numerical properties of the algorithm and compare its performance with other algorithms such as the conjugate gradient method on the normal equations, and conjugate gradient-like schemes such as ORTHOMIN(k), GCR(k) and GMRES(k). We also study the effect of using various preconditioners with these methods. In the second part of the paper, we describe the implementation of our algorithm on the CRAY X-MP/48 multiprocessor, and study its behavior as the number of processors is increased.

Keywords. Conjugate gradient algorithm, CRAY X-MP/48, multiprocessors, nonsymmetric systems, projection methods, symmetric successive overrelaxation.

1. Introduction

Many problems in engineering and science give rise to one of the most fundamental problems of linear algebra: that of solving linear systems of equations. The use of complex models to describe physical systems often results in handling large sparse linear systems of equations.
Since these systems usually occur in the innermost loop of the computational scheme, fast and efficient methods for their solution are very important. One way of speeding up the solution of these systems is to use parallel computers. While vector machines such as the CRAY-1 and CYBER 205 have provided speedups over sequential machines, even higher speedups are desired. Such gains can be achieved only through the parallelism offered by multiprocessors such as the CRAY X-MP/48 or the ALLIANT FX/80. This, in turn, requires developing algorithms that exploit both concurrency and vectorization.

* This work was supported in part by the National Science Foundation under Grant Nos. US NSF DCR84-10110 and US NSF DCR85-09970, the U.S. Department of Energy under Grant No. US DOE-DE-FG02-85ER25001, the U.S. Air Force Office of Scientific Research under Grant No. AFOSR-85-0211, the IBM Donation, and Digital Equipment Corporation.

0167-8191/89/$3.50 © 1989, Elsevier Science Publishers B.V. (North-Holland)