A New Parallel Paradigm for Block-based Gauss-Jordan Algorithm
Ling Shang, Serge Petiton and Maxime Hugues
LIFL, University of Science and Technology of Lille
Grand-Large Team, INRIA Futurs
Lille, France
(ling.shang, serge.petiton, maxime.hugues)@lifl.fr
Abstract—The block-based Gauss-Jordan algorithm, a classical
method for large-scale matrix inversion, offers two kinds of
parallelism: intra-step and inter-step. The existing parallel
paradigm for the algorithm targets only intra-step parallelism,
so it cannot run enough tasks simultaneously to exploit
high-performance platforms in which more and more computing
resources can be harnessed. To overcome this problem, this
paper presents a hybrid parallel paradigm that exploits all
the parallelizable parts of the Gauss-Jordan algorithm. In
this hybrid paradigm, 1) the divide-and-conquer paradigm
decomposes large-granularity tasks into as many sub-tasks as
possible; 2) the single program, multiple data (SPMD) paradigm
handles intra-step parallelism in the algorithm; and 3) the
data pipelining paradigm handles inter-step parallelism.
Finally, experiments comparing the hybrid parallel paradigm
with the existing parallel paradigm show that the hybrid
paradigm performs well.
Keywords-Gauss-Jordan algorithm; parallel paradigm; data
dependence; parallelism
I. INTRODUCTION
A good parallel programming paradigm should help to
maximize the parallel execution of an algorithm and thus
achieve better performance. The choice of paradigm is
determined by the available parallel computing resources and
by the type of parallelism inherent in the problem [14].
Exploiting the significant computational capability
available in Internet-based Grid environments has gained
enthusiastic acceptance within the high performance
computing community, and the current trend favors this
sort of commodity supercomputing [10]. Multi-core
architectures (MCAs) give applications an opportunity
to achieve much higher performance, and the number of
cores on MCAs is likely to keep growing, increasing their
performance potential [12]. These technologies are mainly
motivated by the desire of most scientific communities to
minimize economic risk and rely on consumer-based,
off-the-shelf technology. Grid computing and multi-core
have been recognized as the wave of the future for solving
large scientific problems. However, realizing this
performance potential requires an application to expose a
significant amount of thread-level parallelism. It is
therefore important for researchers to extract the maximal
parallelism from a given algorithm, so that the computing
resources of a Grid platform or of MCAs are exploited as
fully as possible.
The block-based Gauss-Jordan algorithm [1][2][4], a
classical method for large-scale matrix inversion, is used in
weather prediction, aircraft design, graphic transformation
and so on. Its wide applicability across many domains has
made it the focus of many researchers. S. Petiton presents a
parallel version of the algorithm adapted to MIMD [1]. N.
Melab et al. not only give a parallel version tailored
to MARS but also analyze all the possible parallelism
in the algorithm [2][4]. L. M. Aouad et al. present a
parallel programming paradigm for it based on SPMD [7]. As
is well known, a paradigm is a class of algorithms
that have the same control structure, and it can easily be
tailored to any execution model, such as MPI, PVM and
other middleware suited to high performance computing.
A good programming paradigm is therefore very important for
an algorithm to achieve better performance. However, the
programming paradigms given by N. Melab and L. M. Aouad do
not take inter-step parallelism into account. To improve the
efficiency of the algorithm, it is necessary to find a
solution that exploits both the inter-step and the intra-step
parallelism in the algorithm, thus generating more tasks
and executing them simultaneously. We therefore analyzed
the sequential block-based Gauss-Jordan algorithm, and its
characteristics can be summarized as follows: 1) all operands
are data blocks, and the order of operations in the algorithm
is determined by the data dependence between different
blocks; 2) the number of execution steps of the algorithm is
equal to the number of data blocks into which the matrix is
divided; 3) the parallelism of the basic operations in the
algorithm depends on the data write-operations; 4) the number
of data write-operations is the same in each iterative
step. This analysis shows that the data dependence between
different blocks plays a very important role in the
algorithm. This paper therefore emphasizes the analysis of
data dependence between different blocks, and a table is used
to simulate the real matrix manipulation. A formal
description based on the table simulation is then given to
demonstrate the data dependence that exists between
different blocks. At the same
time, the write-operation which controls the data dependence
2009 Eighth International Conference on Grid and Cooperative Computing
978-0-7695-3766-5/09 $25.00 © 2009 IEEE
DOI 10.1109/GCC.2009.75