International Journal of Scientific & Engineering Research, Volume 5, Issue 12, December-2014, ISSN 2229-5518, IJSER © 2014, http://www.ijser.org

Survey on Particle Swarm Optimization accelerated on GPGPU

Joanna Kołodziejczyk

Abstract: The paper presents an overview of recent research on parallelizing the Particle Swarm Optimization (PSO) algorithm on graphics processing units used for general-purpose computation (GPGPU). This survey attempts to collect, organize, and present in a unified way the reports published in the area since 2007. To organize the literature, a classification by objective functions and PSO variants is proposed. The paper also compares experimental results using the most commonly reported measure, the computational acceleration ratio known as speedup. The results of the survey are given in a compact yet comprehensive form and can serve as a guide to this area. In summary, the paper discusses conclusions from the categorization, a comparability problem, and possible research areas.

Index Terms: General-Purpose computing on Graphics Processor Units, NVIDIA CUDA, Particle Swarm Optimization

1 INTRODUCTION
The Particle Swarm Optimization algorithm is a popular tool for exploring continuous domains, presented for the first time in [1]. The main PSO attributes are: 1) it finds a satisfactory solution for complex and large-scale problems; 2) it converges fast; 3) it is easy to implement; 4) the number of adjustable factors is relatively small. The major problem with practical PSO implementations is runtime, especially in multidimensional optimization tasks. One of the most promising ways to speed up the computation is to use parallel implementations. All population- or swarm-based algorithms, including PSO, are ideally suited for parallelization. Since 2001, developers have been able to use GPUs, which are high-performance parallel accelerators.
A PC equipped with a programmable graphics unit can be viewed as a dual-processor device in which, depending on the calculations, tasks can be split between the GPU and the CPU. Because consumer-level GPUs are widely available, programmable, and high-performing, NVIDIA Corporation developed the Compute Unified Device Architecture (CUDA) platform and implemented it on the GPUs it produces. This programming model has become very popular because it eases GPU code development. The CUDA platform allows GPU code to be written as C functions called kernels. Each kernel is executed by many GPU threads in a Single-Instruction-Multiple-Thread (SIMT) fashion, and each thread executes the entire kernel once [2].

The popularity of GPGPU as a platform for parallel implementation of population-based meta-heuristic optimization methods has resulted in two publications summarizing recent results in the area. Kromer et al. [3] presented a general description of twenty-three GPGPU PSO implementations from the CUDA programming point of view, summarizing the optimization problems, the data organization, and the most interesting results and problems. The second report by Kromer et al. [4] provides a brief overview of the latest state-of-the-art research on the design, implementation, and applications of parallel GA, DE, PSO, and SA-based methods on GPUs. The authors briefly described each of the meta-heuristics covered and gave a detailed description of the parallel CUDA programming model. They described eighteen PSO GPGPU implementations published between 2012 and 2014, reporting the application area, the most important results, and, where possible, the graphics card used. Neither survey by Kromer et al. offers a method for classifying or organizing the literature. The objective of this paper is to collect, organize, and present publications on GPGPU PSO implementations.
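
The kernel execution model described earlier in this section (many threads, each running the entire kernel once) can be illustrated with a small sketch. The code below is not GPU code: it is a sequential, CPU-side emulation in Python, and the names `sphere_kernel` and `launch`, the one-thread-per-particle mapping, and the sphere objective function are all assumptions chosen for illustration. In an actual CUDA kernel the thread index would typically be computed as `blockIdx.x * blockDim.x + threadIdx.x`, and the threads would run in parallel rather than in a loop.

```python
# CPU-side emulation of the SIMT idea: every "thread" runs the same kernel
# body exactly once and differs only in the index it receives.
# sphere_kernel and launch are hypothetical names, not part of any GPU API.

def sphere_kernel(thread_id, positions, fitness):
    """Kernel body: one thread evaluates one particle on f(x) = sum(x_i^2)."""
    # Bounds guard: launches often start more threads than there are particles.
    if thread_id >= len(positions):
        return
    fitness[thread_id] = sum(x * x for x in positions[thread_id])


def launch(kernel, n_threads, *args):
    """Stand-in for a kernel launch; a GPU would run these threads in parallel."""
    for thread_id in range(n_threads):
        kernel(thread_id, *args)


positions = [[1.0, 2.0], [0.0, 0.0], [-3.0, 4.0]]  # 3 particles, 2 dimensions
fitness = [None] * len(positions)
launch(sphere_kernel, 4, positions, fitness)       # 4 threads for 3 particles
print(fitness)  # [5.0, 0.0, 25.0]
```

The fourth thread has no particle to evaluate and exits at the guard, which mirrors the usual practice of rounding the thread count up to a multiple of the block size.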
To organize the growing body of literature in this field, the paper presents a categorization of the different types of GPU PSO implementations. The categories derive from the diversity of implementations (standard benchmark functions or real-world optimization problems) and from the PSO algorithm variants. Further attributes that helped organize the papers were chosen so that experimental results could be compared (runtime, speedup ratio, and effectiveness in finding the optimum).

This paper is organized as follows. The next section gives a brief introduction to the particle swarm algorithm and identifies the categories that arise from its different variants. Section 3 describes the objective functions applied in the literature. Section 4 presents the categories that emerged and are used to classify the papers. Section 5 presents the literature analysis and discussion. The conclusions describe the comparability problem and further research areas.

2 PSO ALGORITHM VARIANTS
This section briefly describes the PSO algorithm in its standard version. The subsections present PSO variants distinguished by velocity update rule, neighborhood, and number of swarms. These variants are used as categories in organizing the literature.

Joanna Kołodziejczyk is currently an assistant professor at the Faculty of Computer Science and Information Technology, West Pomeranian University of Technology, Szczecin, Poland. E-mail: jkolodziejczyk@wi.zut.edu.pl