International Journal of Scientific & Engineering Research, Volume 5, Issue 12, December-2014 344
ISSN 2229-5518
IJSER © 2014
http://www.ijser.org
Survey on Particle Swarm Optimization
accelerated on GPGPU
Joanna Kołodziejczyk
Abstract— This paper presents an overview of recent research on parallelizing the Particle Swarm Optimization (PSO) algorithm on
Graphics Processing Units used for general-purpose computation (GPGPU). The survey collects, organizes, and presents, in a unified way,
the reports in this area published since 2007. To organize the literature, a classification by objective function and PSO variant is
proposed. The paper also compares experimental results using the most commonly reported measure, the computational acceleration ratio
called speedup. The results of the survey are given in a compact and comprehensive form and could be used as a guide to this area. In
summary, conclusions from the categorization, the comparability problem, and possible research areas are discussed.
Index Terms—General-Purpose computing on Graphics Processing Units, NVIDIA CUDA, Particle Swarm Optimization
—————————— ——————————
1 INTRODUCTION
THE Particle Swarm Optimization algorithm is a popular
tool for exploring continuous domains, presented for the
first time in [1]. The main PSO attributes are: 1) it finds a
satisfactory solution for complex and large-scale problems; 2) it
converges fast; 3) it is easy to implement; 4) the number of
adjustable parameters is relatively small. The major problem with
practical PSO implementations is their runtime, especially in
multidimensional optimization tasks.
One of the most promising ways to speed up the
computation is the use of parallel implementations.
All population-based (swarm-based) algorithms, including PSO,
are ideally suited for parallelization. Since 2001,
developers have been able to use GPUs, which are high-performance
parallel accelerators. A PC equipped with a programmable
graphics unit can be viewed as a dual-processor device
in which, depending on the calculations, tasks can be split
between the GPU and the CPU.
Owing to the wide availability, programmability, and high
performance of consumer-level GPUs, NVIDIA Corporation
developed the Compute Unified Device Architecture (CUDA)
platform and implemented it on the GPUs it produces. This
programming model has become very popular because it eases
GPU code development. The CUDA platform allows
GPU code to be written as C functions called kernels. Each kernel
is executed by many GPU threads in a
Single-Instruction-Multiple-Thread (SIMT) fashion, and each
thread executes the entire kernel once [2].
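The kernel and thread model described above can be illustrated with a minimal sketch. It is not taken from any of the surveyed implementations: the kernel name evaluateSphere, the choice of the sphere function as objective, the one-thread-per-particle mapping, and all sizes are assumptions made only for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread evaluates the sphere function
// f(x) = sum_d x_d^2 for exactly one particle (assumed mapping).
__global__ void evaluateSphere(const float* positions, float* fitness,
                               int numParticles, int dims)
{
    // Global thread index, used here as the particle index.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numParticles) return;  // guard threads beyond the swarm size

    float sum = 0.0f;
    for (int d = 0; d < dims; ++d) {
        float x = positions[i * dims + d];  // row-major particle layout (assumed)
        sum += x * x;
    }
    fitness[i] = sum;
}

int main()
{
    const int numParticles = 1024;  // illustrative swarm size
    const int dims = 8;             // illustrative problem dimension
    const size_t posBytes = numParticles * dims * sizeof(float);
    const size_t fitBytes = numParticles * sizeof(float);

    // Host buffers with fixed positions so the result is easy to check.
    float* hPos = new float[numParticles * dims];
    float* hFit = new float[numParticles];
    for (int k = 0; k < numParticles * dims; ++k) hPos[k] = 0.5f;

    // Device buffers and host-to-device copy.
    float* dPos = nullptr;
    float* dFit = nullptr;
    cudaMalloc(&dPos, posBytes);
    cudaMalloc(&dFit, fitBytes);
    cudaMemcpy(dPos, hPos, posBytes, cudaMemcpyHostToDevice);

    // Launch enough blocks to cover all particles; every thread runs
    // the whole kernel body once.
    const int threadsPerBlock = 256;
    const int blocks = (numParticles + threadsPerBlock - 1) / threadsPerBlock;
    evaluateSphere<<<blocks, threadsPerBlock>>>(dPos, dFit, numParticles, dims);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Blocking copy back to the host; it also waits for the kernel to finish.
    cudaMemcpy(hFit, dFit, fitBytes, cudaMemcpyDeviceToHost);
    printf("fitness[0] = %f\n", hFit[0]);

    cudaFree(dPos);
    cudaFree(dFit);
    delete[] hPos;
    delete[] hFit;
    return 0;
}
```

Mapping one thread to one particle is only one possible design. Finer-grained mappings, for example one thread per particle dimension, are equally compatible with the SIMT model.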
The popularity of GPGPU as a platform for parallel
implementation of population-based meta-heuristic
optimization methods has resulted in two publications summarizing
recent results in the area. Kromer et al. [3]
presented a general description of twenty-three GPGPU PSO
implementations from the CUDA programming point of view.
They summarized the optimization problems, the data organization, and
the most interesting results and problems. The second
report by Kromer et al. [4] provides a brief overview of the
latest state-of-the-art research on the design, implementation,
and applications of parallel GA, DE, PSO, and SA-based
methods on GPUs. The authors briefly described each of the presented
meta-heuristics and gave a detailed description of the parallel
CUDA programming model. They described eighteen PSO
GPGPU implementations published between 2012 and 2014, giving
information about the application area, the most important
results, and, when possible, the graphics card used. Both
surveys by Kromer et al. lack a method for classifying or
organizing the literature.
The objective of this paper is to collect, organize, and
present publications on GPGPU PSO implementations. To
organize the growing body of literature in this
field, the paper presents a categorization of the different types
of GPU PSO implementations. The categories reflect the
diversity of implementations (standard benchmark functions or
real-world optimization problems) and the PSO
algorithm variants. Other attributes that helped organize the
papers were chosen in order to compare
experimental results (runtime, speedup ratio, and
effectiveness in finding the optimum).
This paper is organized as follows. The next section gives a
brief introduction to the particle swarm algorithm and
indicates the categories that arise from its different variants. Section
3 describes the objective functions applied in the literature.
Section 4 presents the resulting categories used to classify the
papers. Section 5 presents the literature analysis and
discussion. The conclusions describe the comparability
problem and further research areas.
2 PSO ALGORITHM VARIANTS
This section briefly describes the PSO algorithm in its
standard version. The subsections present different PSO variants,
distinguished by the velocity update rule, the
neighborhood, and the number of swarms. These PSO variants will be
used as categories in organizing the literature.
————————————————
• Joanna Kołodziejczyk is currently an assistant professor at the Faculty of Computer
Science and Information Technology, West Pomeranian University of
Technology, Szczecin, Poland. E-mail: jkolodziejczyk@wi.zut.edu.pl