A Fine-Grain Hypergraph Model for 2D Decomposition of Sparse Matrices

Ümit V. Çatalyürek
Dept. of Pathology, Division of Informatics
Johns Hopkins Medical Institutions
Baltimore, MD 21287
umit@jhmi.edu

Cevdet Aykanat
Computer Engineering Department
Bilkent University
Ankara, 06533 Turkey
aykanat@cs.bilkent.edu.tr

Abstract

We propose a new hypergraph model for the decomposition of irregular computational domains. This work focuses on the decomposition of sparse matrices for parallel matrix-vector multiplication. However, the proposed model can also be used to decompose computational domains of other parallel reduction problems. We propose a "fine-grain" hypergraph model for two-dimensional decomposition of sparse matrices. In the proposed fine-grain hypergraph model, vertices represent nonzeros and hyperedges represent sparsity patterns of rows and columns of the matrix. By partitioning the fine-grain hypergraph into equally weighted vertex parts (processors) so that hyperedges are split among as few processors as possible, the model correctly minimizes communication volume while maintaining computational load balance. Experimental results on a wide range of realistic sparse matrices confirm the validity of the proposed model, which achieves up to 50 percent better decompositions than the existing models in terms of total communication volume.

1 Introduction

Repeated matrix-vector multiplication y = Ax that involves the same large, sparse, structurally symmetric or nonsymmetric square matrix A is the kernel operation in iterative solvers. These algorithms also involve linear operations on dense vectors. For efficient parallelization of these iterative algorithms, matrix A should be partitioned among processors in such a way that communication overhead is kept low while maintaining computational load balance.
In order to avoid the communication of vector components during the linear vector operations, a symmetric partitioning scheme is adopted. That is, all vectors used in the solver (including the x and y vectors) are divided conformally.

This work is partially supported by Turkish Science and Research Council under grant EEEAG-199E013.

The standard graph-partitioning approach has been widely used for one-dimensional (1D) decomposition of irregularly sparse matrices. In recent works, we [3, 4] and Hendrickson [9] pointed out the flaws and shortcomings of the standard graph-partitioning approach. In [3, 4], we proposed a hypergraph-partitioning approach that correctly minimizes the communication volume in 1D matrix decomposition. Other recently proposed alternative models for 1D matrix decomposition are discussed in the excellent survey by Hendrickson and Kolda [10].

The literature that addresses 2D matrix decomposition is scarce. The 2D checkerboard decomposition schemes proposed by Hendrickson et al. [11] and Lewis and van de Geijn [15] are typically suitable for dense matrices or sparse matrices with structured nonzero patterns that are difficult to exploit. These schemes do not involve explicit effort towards reducing communication volume.

Parallel matrix-vector multiplication y = Ax is one of the basic parallel reduction algorithms. Elements of vector x are the inputs of the reduction and elements of vector y are the outputs of the reduction. Matrix A corresponds to the mapping from input elements to output elements. Hence, any technique used in sparse matrix decomposition is also applicable to other reduction problems.

In this paper, we propose a fine-grain hypergraph-partitioning model for 2D decomposition of irregularly sparse matrices based on our previous work [2].
Vertices of the proposed fine-grain hypergraph correspond to the nonzeros of the matrix to model each scalar multiplication operation as an atomic task during the decomposition. Hyperedges of the fine-grain hypergraph correspond to columns and rows of the matrix to model the communication volume requirement of the expand and fold operations in the parallel matrix-vector multiplication. By partitioning the fine-grain hypergraph into equally weighted vertex parts (processors) so that hyperedges are split among as few processors as possible, the model correctly minimizes communication volume while maintaining computational load balance.
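
The construction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each split hyperedge is charged by the number of parts it touches minus one (the connectivity-minus-one cut metric), and that every x and y entry is owned by a processor holding a nonzero in the corresponding column or row. The function names, the 3x3 sparsity pattern, and the two-way partition are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_fine_grain_hypergraph(nonzeros):
    """Build row and column nets over the nonzeros of a sparse matrix.

    nonzeros: list of (row, col) coordinates; vertex v stands for nonzeros[v].
    Returns (row_nets, col_nets), each mapping an index to its list of vertices.
    """
    row_nets = defaultdict(list)
    col_nets = defaultdict(list)
    for v, (i, j) in enumerate(nonzeros):
        row_nets[i].append(v)  # row net i: nonzeros contributing to y_i
        col_nets[j].append(v)  # column net j: nonzeros that need x_j
    return dict(row_nets), dict(col_nets)


def connectivity_minus_one(nets, part):
    """Sum over nets of (number of distinct parts touched by the net) - 1.

    Assumption: this cut metric is used as the communication cost of a net.
    """
    return sum(len({part[v] for v in pins}) - 1 for pins in nets.values())


def communication_volume(nonzeros, part):
    """Return (expand_volume, fold_volume) for a vertex-to-part assignment."""
    row_nets, col_nets = build_fine_grain_hypergraph(nonzeros)
    expand = connectivity_minus_one(col_nets, part)  # sending x_j values
    fold = connectivity_minus_one(row_nets, part)    # sending partial y_i sums
    return expand, fold


def part_loads(part, num_parts):
    """Count nonzeros (scalar multiplications) assigned to each part."""
    loads = [0] * num_parts
    for p in part:
        loads[p] += 1
    return loads


# Hypothetical 3x3 pattern with six nonzeros split over two processors.
nonzeros = [(0, 0), (0, 2), (1, 1), (1, 2), (2, 0), (2, 2)]
part = [0, 0, 1, 1, 0, 1]
print(communication_volume(nonzeros, part))  # (1, 1)
print(part_loads(part, 2))                   # [3, 3]
```

In this toy partition only column 2 and row 2 are split between the two processors, so one x entry is sent in the expand phase and one partial result is sent in the fold phase, while each processor performs three scalar multiplications. A partitioner would search over `part` to lower these counts subject to the balance constraint; that search is outside the scope of the sketch.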