Performance Evaluation of a Parallel Dynamic
Programming Algorithm for Solving the
Matrix Chain Product Problem
Bchira BEN MABROUK (1), Hamadi HASNI (2)
(1) Higher Institute of Applied Sciences and Technologies,
University of Carthage, 7030, Mateur, Tunisia
benmabrouk_bchira@yahoo.fr
(2) National School of Computer Science
University Campus of Manouba, 2010, Manouba, Tunisia
Hamadi.Hasni@ensi.rnu.tn
Zaher MAHJOUB
University of Tunis El Manar
Faculty of Sciences of Tunis
University Campus
2092, Manar II, Tunis, Tunisia
Zaher.Mahjoub@fst.rnu.tn
Abstract—We address in this paper a particular combinatorial
optimization problem (COP), namely the matrix chain
product problem (MCPP). We consider the parallelization of the
dynamic programming algorithm (DPA) for solving the MCPP,
which is structured as a DO loop nest of depth 3. Our approach
is based on a three-phase procedure. The first phase transforms
the DPA into a perfect loop nest (PLN). The second applies a
dependence analysis to the initial PLN in order to determine the
type of each loop (serial or parallel). The third applies the loop
interchange technique to the initial PLN in order to increase the
degree of parallelism. We focus in this paper on an experimental
study carried out on a parallel multicore machine, which
validates our theoretical contribution.
Keywords— combinatorial optimization problem; dependence
analysis; DO loop nest; dynamic programming; loop interchange;
matrix chain product; multicore machine; parallelization;
performance evaluation; polyhedral algorithm.
I. INTRODUCTION
Dynamic programming (DP) is an efficient paradigm for
designing algorithms that solve a large class of combinatorial
optimization problems (COP). DP algorithms (DPA) have the
particular structure of DO loop nests and are, in most cases, of
polynomial complexity. Such algorithms are also polyhedral
algorithms. Given an input COP, the DP paradigm adopts a
bottom-up approach: the smallest sub-problems are solved first,
and their solutions are used to solve sub-problems of larger size.
The procedure is iterated until the solution of the input problem
is determined. The key idea is to express, through a
recurrence formula, the solution of the initial problem in terms
of the solutions of its child sub-problems [10].
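As an illustration of this bottom-up scheme, the following Python sketch gives the usual textbook formulation of a DP for the matrix chain product problem studied in this paper. It is only a minimal sketch: the function name `matrix_chain_order` and the `dims` encoding (matrix i has shape dims[i-1] x dims[i]) are assumptions made for the example, and the loop nest is not necessarily the exact one analysed in the remainder of the paper.

```python
def matrix_chain_order(dims):
    """Minimal number of scalar multiplications for a matrix chain.

    Assumed encoding: matrix i (1-based) has shape dims[i-1] x dims[i],
    so a chain of n matrices is described by n + 1 integers.
    """
    n = len(dims) - 1
    # m[i][j] holds the optimal cost of the sub-chain i..j (row/column 0 unused).
    m = [[0] * (n + 1) for _ in range(n + 1)]
    # Bottom-up: sub-chains of length 2 first, then 3, and so on.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            # Recurrence: try every split point k between i and j.
            m[i][j] = min(
                m[i][k] + m[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                for k in range(i, j)
            )
    return m[1][n] if n >= 1 else 0
```

For a chain of two matrices there is a single product, so `matrix_chain_order([10, 20, 30])` returns 10 x 20 x 30 = 6000. The three nested iterations (over the sub-chain length, its start index, and the split point) correspond to a loop nest of depth 3.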
In this paper, we are particularly interested in the DP
paradigm for solving the matrix chain product problem
(MCPP). This problem has diverse real-world
applications, e.g., in robotics, process control, and computer
animation [31]. Our aim here is to use several versions of the
DPA for the MCPP and study their parallelization. A
detailed theoretical study, based on a previous brief
presentation [4], is given first. In addition, we focus on an
experimental study targeting a multicore machine in order to
achieve an accurate performance evaluation of the designed
parallel DPAs and thereby validate our contribution.
The remainder of the paper is organized as follows. In
Section II, we first present the MCPP and the associated DPA
for solving it, then the state of the art on previous works,
including sequential and parallel algorithms. Section III is
devoted to a description of our parallelization approach. An
experimental study is described in Section IV. Finally, we
conclude our work in Section V and outline some
perspectives.
II. THE MATRIX CHAIN PRODUCT PROBLEM (MCPP)
A. Presentation
The matrix chain product problem (MCPP) is a
combinatorial optimization problem (COP) consisting in
finding an optimal parenthesization of a chain of rectangular
matrices, i.e., one that minimizes the total number of required
scalar multiplications [10]. Indeed, for a chain of three or
more matrices to be multiplied, the total number of
multiplications may vary depending on the chain
parenthesization.
To see this, consider for instance three matrices A,
B, and C, of size 5×1, 1×5, and 5×1, respectively. The product
ABC may be computed in two ways, i.e., according to two
parenthesizations, namely (AB)C or A(BC). Clearly, the
first requires 5×1×5 + 5×5×1 = 50 multiplications while the
second requires only 1×5×1 + 5×1×1 = 10 multiplications.
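The arithmetic of this example can be checked with a short Python sketch. The helper `product_cost` and the tuple encoding of the matrix shapes are assumptions introduced for illustration; the only fact used is that multiplying a p×q matrix by a q×r matrix takes p×q×r scalar multiplications.

```python
def product_cost(p, q, r):
    """Scalar multiplications needed to multiply a p x q matrix by a q x r matrix."""
    return p * q * r

# Shapes from the example: A is 5x1, B is 1x5, C is 5x1.
a, b, c = (5, 1), (1, 5), (5, 1)

# (AB)C: form the 5x5 product AB, then multiply it by C.
cost_ab_c = product_cost(a[0], a[1], b[1]) + product_cost(a[0], b[1], c[1])

# A(BC): form the 1x1 product BC, then multiply A by it.
cost_a_bc = product_cost(b[0], b[1], c[1]) + product_cost(a[0], a[1], c[1])

print(cost_ab_c, cost_a_bc)  # 50 10
```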
For a chain of n matrices with n large, we cannot
afford to try all the possible parenthesizations, since their
number, given by a Catalan number, grows as $\Omega(4^n / n^{3/2})$
[10][22].
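To make this growth concrete, the sketch below counts the parenthesizations of a chain of n matrices. It relies on the standard identity that this count is the Catalan number C(n-1), with C(k) = binom(2k, k) / (k + 1); the function name `num_parenthesizations` is an assumption made for the example.

```python
from math import comb

def num_parenthesizations(n):
    """Number of ways to fully parenthesize a chain of n >= 1 matrices.

    Uses the standard identity: the count equals the Catalan number C(n-1),
    where C(k) = binom(2k, k) / (k + 1).
    """
    if n < 1:
        raise ValueError("the chain must contain at least one matrix")
    k = n - 1
    return comb(2 * k, k) // (k + 1)

# Counts for short chains: n = 1..6.
print([num_parenthesizations(n) for n in range(1, 7)])  # [1, 1, 2, 5, 14, 42]
```

For n = 3 the function returns 2, matching the two parenthesizations (AB)C and A(BC) of the example above. The count grows by a factor approaching 4 with each additional matrix, which is why exhaustive enumeration is impractical for long chains.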
978-1-4799-7100-8/14/$31.00 ©2014 Crown