CITK - an architecture and examples of CUDA enabled ITK filters

Release 0.00

Richard Beare¹, Daniel Micevski, Chris Share, Luke Parkinson, Phil Ward, Wojtek Goscinski¹, Mike Kuiper²

May 25, 2011

¹ Richard.Beare@monash.edu, Monash University, Melbourne, Australia
² mike@vpac.org, Victorian Partnership for Advanced Computing, Melbourne, Australia

Abstract

There is great interest in the use of graphics processing units (GPUs) for general purpose applications because the highly parallel architectures used in GPUs offer the potential for huge performance increases. The use of GPUs in image analysis applications has been under investigation for a number of years. This article describes modifications to the Insight Toolkit (ITK) that provide a simple architecture for transparent use of GPU enabled filters and examples of how to write GPU enabled filters using the NVIDIA CUDA tools. This work was performed between late 2009 and early 2010 and is being published as modifications to ITK 3.20. It is hoped that publication will help inform development of more general GPU support in ITK 4.0 and facilitate experimentation by users requiring functionality of 3.20 or wishing to pursue CUDA based developments.

Contents

1 Introduction
2 CITK Architecture
    2.1 Weaknesses
3 Installation and building
    3.1 CUDA compiler and software development kit
    3.2 Fetch this contribution from Google Code
    3.3 Patch ITK 3.20
    3.4 Build and install modified ITK
    3.5 Build examples
    3.6 Changes to standard processes for building ITK applications
4 Anatomy of a CUDA enabled filter
    4.1 Memory management
    4.2 Templated kernel files