Classification of Massively Parallel Computer
Architectures
Muhammad Ali Shami
Department of ES
Royal Institute of Technology
16440 Kista, Stockholm
Email: shami@kth.se
Ahmed Hemani
Department of ES
Royal Institute of Technology
16440 Kista, Stockholm
Email: hemani@kth.se
Abstract—Faced with slowing performance and energy benefits
of technology scaling, VLSI/Computer architectures have turned
from parallel to massively parallel machines for personal and
embedded applications in the form of multi and many core
architectures. Additionally, in the pursuit of finding the sweet
spot between engineering and computational efficiency, massively
parallel Coarse Grain Reconfigurable Architectures (CGRAs)
have been researched. While these architectures have been surveyed,
they have not been rigorously classified to enable objective
differentiation and comparison for performance, area and flex-
ibility. In this paper, we extend the well known Skillicorn
taxonomy to create new classes, present a scoring system to
rate these classes on flexibility, and present equations for early
estimation of area and configuration overheads. Furthermore, we
use this extended classification scheme to classify and compare
25 different massively parallel architectures that cover most of
the reported CGRAs and other well known multi- and many-core
architectures.
I. INTRODUCTION
Parallelism is becoming mainstream. This is reflected in
the widespread adoption of multi-processor architectures that
power desktop PCs, mobile phones, video games, graphics,
and personal high performance scientific computing based on
general purpose GPUs. The research community is also devoting
increased attention to all aspects of research in parallel
computing and programming models, as can be seen in Figure 1.
The figure shows the number of publications in different fields
of parallel computing over the last 15 years. It can be seen that
research interest in parallel computing, especially in multi-core
and reconfigurable computer architectures, has increased
significantly in the last five years. The seminal survey on the
Landscape of Parallel Computing [1] from Berkeley attests to this
trend of increased interest from the academic world. Massively
parallel reconfigurable computing, covering both fine grain FPGAs
and Coarse Grain Reconfigurable Architectures (CGRAs), has also
been the subject of intense research and is finding widespread
adoption in industry. These massively parallel reconfigurable
computing architectures have also been surveyed by Reiner
Hartenstein [2], Raj Krishnamurthy [3] and Max Baron [4].
Survey papers are of great value to the research community,
especially to new participants, as they provide a single source of
information on the most significant research and the challenges
in a specific domain. This paper complements the survey
papers on reconfigurable computing and the wider survey
paper on the parallel computing landscape [1] by providing
a classification scheme that enables the community
to objectively compare and differentiate present and newer
parallel computing architectures, including multi/many core
architectures, CGRAs and FPGAs.
Flynn [5] classified computer architectures into four
categories: SISD, SIMD, MISD and MIMD. This taxonomy is
perhaps the oldest, simplest and most widely known.
Skillicorn [6], citing the broadness of Flynn's taxonomy as a
limitation, introduced a more comprehensive way of classifying
computer architectures. He used the Data Processor (DP),
Instruction Processor (IP), Data Memory (DM) and Instruction
Memory (IM) as the building blocks of a computer architecture
and classified architectures according to the number of each
block (0, 1 or n) and how the blocks are connected (one to one
or one to many). The limitation of Skillicorn's taxonomy is the
granularity of its building blocks. Because the blocks are coarse
grained, they cannot exchange their roles. Therefore
the number of these elements in a computer architecture
remains fixed (0, 1 or n). This limitation restricts the
application of Skillicorn's taxonomy to modern reconfigurable
architectures (FPGAs, CGRAs), where the basic blocks are of
finer granularity (gates, LUTs, CLBs) and can assume the role
of an IP, a DP or a memory element. The number of IPs,
DPs or memory elements in these architectures therefore changes
upon reconfiguration; it is variable and denoted by the symbol 'v'.
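The building-block view described above can be illustrated with a small data structure. The following Python sketch is not taken from the paper: the names ArchClass, is_reconfigurable and flynn_category are hypothetical, and the mapping from IP/DP counts to Flynn's categories is a simplifying assumption that ignores how the blocks are connected. It only shows how a class can be recorded as four counts drawn from 0, 1, n and the additional symbol v.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed counts per building block: "0", "1" and "n" as in Skillicorn's
# taxonomy, plus "v" (variable) for blocks whose number changes upon
# reconfiguration, as described in the text above.
COUNTS = ("0", "1", "n", "v")


@dataclass(frozen=True)
class ArchClass:
    """Hypothetical record of one architecture class (illustration only)."""
    ip: str  # number of Instruction Processors
    dp: str  # number of Data Processors
    im: str  # number of Instruction Memories
    dm: str  # number of Data Memories

    def __post_init__(self) -> None:
        # Reject any count outside the four symbols used by the taxonomy.
        for name in ("ip", "dp", "im", "dm"):
            value = getattr(self, name)
            if value not in COUNTS:
                raise ValueError(f"{name} must be one of {COUNTS}, got {value!r}")

    def is_reconfigurable(self) -> bool:
        # A class with at least one variable count cannot be described
        # with the fixed counts 0, 1 or n alone.
        return "v" in (self.ip, self.dp, self.im, self.dm)


def flynn_category(arch: ArchClass) -> Optional[str]:
    """Assumed, simplified mapping from IP/DP counts to Flynn's categories.

    Returns None when the class has no Flynn counterpart under this
    assumption (for example zero IPs or variable counts).
    """
    if arch.is_reconfigurable():
        return None
    table = {
        ("1", "1"): "SISD",
        ("1", "n"): "SIMD",
        ("n", "1"): "MISD",
        ("n", "n"): "MIMD",
    }
    return table.get((arch.ip, arch.dp))


if __name__ == "__main__":
    array_processor = ArchClass(ip="1", dp="n", im="1", dm="n")
    fpga_like = ArchClass(ip="v", dp="v", im="v", dm="v")
    print(flynn_category(array_processor))  # SIMD
    print(fpga_like.is_reconfigurable())    # True
```

In this sketch a fabric whose blocks can change roles is simply marked with v in every field, which is why it falls outside the fixed-count classes and has no Flynn category under the stated assumption.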
In this paper, we use Skillicorn's taxonomy as the starting
point and extend it to cover the richer parallel and reconfigurable
computing landscape that has emerged since the paper
was published in 1988. We also improve the predictive power
of this taxonomy by naming the classes according to a set of
rules. We define flexibility and give each class a relative
flexibility value so that classes can be compared against each
other. Using our extended classification, we also try to predict
the area of the target architecture. Finally, we apply the extended
Skillicorn taxonomy to existing parallel and reconfigurable
architectures such as multi/many core architectures, CGRAs and FPGAs.
Section II extends Skillicorn's original taxonomy to apply
to the richer parallel and reconfigurable computing landscape.
Section III discusses the predictive power of this taxonomy.
Section IV applies this taxonomy to modern parallel
and reconfigurable architectures and finally Section V gives
2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops
978-0-7695-4676-6/12 $26.00 © 2012 IEEE
DOI 10.1109/IPDPSW.2012.42
337