Advances in Computing, Communication and Control Vol 15, 2011

Implementation of Parallel Image Processing using NVIDIA GPU Framework

Brijmohan Daga 1, Avinash Bhute 2, Ashok Ghatol 3
1 Fr. Conceicao Rodrigues College of Engineering, Mumbai, India
2 V.J.T.I. Mumbai, India
3 D. Y. Patil Group of Institution, Pune, India
bsdaga@yahoo.com, anbhute@gmail.com, ashok.ghatol@gmail.com

Abstract. In this paper we present a real-time image processing technique that uses modern programmable Graphics Processing Units (GPUs). A GPU is a SIMD (Single Instruction, Multiple Data) device that is inherently data-parallel. By using NVIDIA’s new GPU programming framework, “Compute Unified Device Architecture” (CUDA), as a computational resource, we achieve significant acceleration in the computation of different image processing algorithms. Here we present an efficient implementation of these algorithms on the NVIDIA GPU. Specifically, we demonstrate the efficiency of our approach through parallelization and optimization of the algorithm. In the results, we compare the execution times of the CPU and GPU implementations.

Keywords: GPU, CUDA, Image blending

1 Introduction

Even the most powerful multi-core CPUs are not capable of real-time image processing. The increasing resolution of video capture devices and the growing demand for accuracy make real-time performance even harder to achieve. Recently, graphics processing units have evolved into an extremely powerful computational resource. For example, the NVIDIA GeForce GTX 280 is built on a 65 nm process, with 240 processing cores running at 602 MHz and 1 GB of GDDR3 memory running at 1.1 GHz over a 512-bit memory bus. Its peak processing power is 933 GFLOPS [1], that is, 933 billion floating-point operations per second. By comparison, a quad-core 3 GHz Intel Xeon CPU delivers roughly 96 GFLOPS [2]. The computational power of GPUs has been growing by up to approximately 2.3x per year, whereas that of CPUs has been growing by 1.4x per year [2].
At the same time, GPUs are becoming steadily cheaper. As a result, there is a strong desire to use GPUs as alternative computational platforms for accelerating computationally intensive tasks beyond the domain of graphics applications. To support this trend of GPGPU (General-Purpose Computing on GPUs) [3], graphics card vendors have provided programmable GPUs and high-level languages that allow developers to build GPU-based applications.