Object Oriented Framework for CUDA based Image Processing

Pritam Prakash Shete, Venkat P. P. K., Dinesh M. Sarode, Mohini Laghate, S. K. Bose & R. S. Mundada
Bhabha Atomic Research Centre, Mumbai, India, 400085
{ppshete, panikv, dinesh, mlaghate, bose & rsm}@barc.gov.in

Abstract- In this paper, we propose and implement an object oriented framework for GPU based image processing. Compute Unified Device Architecture, i.e. CUDA, is a novel and promising GPU programming framework from NVIDIA. CUDA has been used to speed up many computationally intensive graphics as well as non-graphics applications, but it requires more than just kernel programming. A pyramidal image blending algorithm is essential for a seamless panoramic mosaic. We introduce an object oriented framework for CUDA based pyramidal image blending using software engineering principles and design patterns. We illustrate a set of design patterns that assist in reusing existing functionality. We show that the use of design patterns facilitates extending existing functionality by adding new classes, rather than by modifying existing classes or functionality. We also discuss extending our framework for computation using the GPU texture memory. We evaluate the framework's performance in terms of programming effort and the speedup factor achieved.

Keywords- Object oriented framework; CUDA; design patterns; image processing.

I. INTRODUCTION

Graphics Processing Units, i.e. GPUs, are high performance multi-core processors with very high memory bandwidth. NVIDIA has recently launched the GeForce GTX 580 [1], which has 16 multiprocessors, each with 32 CUDA cores, for a total of 512 CUDA cores. GPUs offer an attractive way to accelerate a wide range of graphics as well as non-graphics applications. “Compute Unified Device Architecture” i.e.
CUDA from NVIDIA is a novel and promising framework that facilitates programming on GPUs without requiring any knowledge of the graphics pipeline. CUDA has been used to speed up computationally intensive applications such as computational biology, cryptography and many more by an order of magnitude or even more. However, GP-GPU computing using the CUDA framework is complex and requires more than just simple kernel programming.

In recent years, many GPU based libraries and frameworks for image processing and computer vision have been developed. OpenVIDIA [10] is written in Cg (C for Graphics) and accelerates various image processing algorithms such as the Hough transform, image registration, and feature location and tracking. It can employ a single GPU or multiple GPUs and comes with an interface for a FireWire camera. MinGPU [11] encapsulates all GPU and OpenGL related code inside class hierarchies to facilitate GP-GPU computing using Cg, GLSL or HLSL without requiring any graphics knowledge. The NVIDIA Performance Primitives library (NPP) [12] is a collection of GPU accelerated common primitives for image, video and signal processing, which can deliver 5x to 10x faster performance while reducing overall development time. GpuCV [14] is compatible with the OpenCV [13] interface and provides seamless GPU acceleration for image processing and computer vision applications. It transparently uses GLSL and the CUDA framework, hiding the low level GPU complexity from the user.

These libraries and frameworks facilitate high performance computing using GPUs. However, many of these are implemented using procedural programming and lack the benefits of object oriented concepts. Moreover, many of these are not built upon a reusable core and/or are limited to the use of OpenGL shaders, and so lack the benefits of the CUDA framework.

Seiller Nicolas et al. [8] have applied software engineering design patterns to CUDA based image processing.
They have isolated image processing algorithms from the image data structure by using the Strategy pattern [5]. They have cascaded multiple algorithms by using the Composite pattern [5]. The use of design patterns for GPU based image processing is their significant contribution. However, they do not use key design patterns such as the Abstract Factory pattern [5] and the Factory Method pattern [5].

Brijmohan Daga et al. [7] have achieved good acceleration of a pyramidal image blending algorithm [3] using the CUDA framework. However, their implementation does not optimize CPU-GPU memory I/O for the image data. The efficiency of this algorithm can be improved by storing the input as well as the intermediate image data in the GPU memory. This avoids unnecessary CPU-GPU memory I/O, resulting in better utilization of the memory bandwidth.

In our earlier research work [6] we have implemented an object oriented framework for the pyramidal image blending algorithm using the CUDA framework. We have applied the Abstract Factory pattern to encapsulate different realizations of the algorithm as separate abstract factories, providing virtual constructor capability for a suite of image processing operations. However, this implementation also does not make use of other significant design patterns such as the Visitor pattern, the Template Method pattern and many more.

2012 International Conference on Communication, Information & Computing Technology (ICCICT), Oct. 19-20, Mumbai, India
978-1-4577-2078-9/12/$26.00 ©2011 IEEE
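
The roles of the three patterns named above (Strategy, Composite and Abstract Factory) can be illustrated with a minimal host-only C++ sketch. It is not the framework described in this paper or in [6], [8]: all class and function names (Image, ImageOperation, Downsample2x, Scale, Pipeline, OperationFactory, CpuOperationFactory, runDemo) are hypothetical, and the 2x2 averaging step is only a simplified stand-in for a pyramid reduce operation. A CUDA realization would presumably be supplied as a further factory subclass whose products launch kernels; that part is assumed and not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical image type: a flat, row-major buffer of float pixels.
struct Image {
    std::size_t width = 0, height = 0;
    std::vector<float> pixels;
    Image(std::size_t w, std::size_t h, float v = 0.0f)
        : width(w), height(h), pixels(w * h, v) {}
};

// Strategy: each algorithm is an object, decoupled from the Image structure.
class ImageOperation {
public:
    virtual ~ImageOperation() = default;
    virtual Image apply(const Image& in) const = 0;
};

// Concrete strategy: halve the resolution by averaging 2x2 blocks.
class Downsample2x : public ImageOperation {
public:
    Image apply(const Image& in) const override {
        assert(in.width >= 2 && in.height >= 2);
        Image out(in.width / 2, in.height / 2);
        for (std::size_t y = 0; y < out.height; ++y) {
            for (std::size_t x = 0; x < out.width; ++x) {
                float sum = 0.0f;
                for (std::size_t dy = 0; dy < 2; ++dy)
                    for (std::size_t dx = 0; dx < 2; ++dx)
                        sum += in.pixels[(2 * y + dy) * in.width + (2 * x + dx)];
                out.pixels[y * out.width + x] = sum / 4.0f;
            }
        }
        return out;
    }
};

// Concrete strategy: multiply every pixel by a constant factor.
class Scale : public ImageOperation {
    float factor_;
public:
    explicit Scale(float factor) : factor_(factor) {}
    Image apply(const Image& in) const override {
        Image out = in;
        for (float& p : out.pixels) p *= factor_;
        return out;
    }
};

// Composite: a cascade of operations is itself an ImageOperation.
class Pipeline : public ImageOperation {
    std::vector<std::unique_ptr<ImageOperation>> stages_;
public:
    void add(std::unique_ptr<ImageOperation> op) { stages_.push_back(std::move(op)); }
    Image apply(const Image& in) const override {
        Image current = in;
        for (const auto& stage : stages_) current = stage->apply(current);
        return current;
    }
};

// Abstract Factory: one factory per realization of the operation suite.
class OperationFactory {
public:
    virtual ~OperationFactory() = default;
    virtual std::unique_ptr<ImageOperation> makeDownsample() const = 0;
    virtual std::unique_ptr<ImageOperation> makeScale(float factor) const = 0;
};

// CPU realization; a GPU realization would be another subclass (assumed).
class CpuOperationFactory : public OperationFactory {
public:
    std::unique_ptr<ImageOperation> makeDownsample() const override {
        return std::make_unique<Downsample2x>();
    }
    std::unique_ptr<ImageOperation> makeScale(float factor) const override {
        return std::make_unique<Scale>(factor);
    }
};

// Client code depends only on the abstract interfaces.
Image runDemo(const OperationFactory& factory, const Image& input) {
    Pipeline pipeline;
    pipeline.add(factory.makeDownsample());
    pipeline.add(factory.makeScale(0.5f));
    return pipeline.apply(input);
}
```

In this sketch, switching to a different realization means passing a different OperationFactory to runDemo; the client code and the existing operation classes stay unchanged, which is the kind of extension by adding new classes that the design patterns are meant to support.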