Real-time Gradient Vector Flow on GPUs using OpenCL

Erik Smistad · Anne C. Elster · Frank Lindseth

Erik Smistad · Anne C. Elster · Frank Lindseth: Dept. of Computer and Information Science, Norwegian University of Science and Technology, Sem Saelandsvei 7-9, NO-7491 Trondheim. Tel.: +47 73594475. E-mail: smistad@idi.ntnu.no
Frank Lindseth: SINTEF Medical Technology

This is a preprint. The final publication is available at link.springer.com.

Abstract The Gradient Vector Flow (GVF) is a feature-preserving spatial diffusion of gradients. It is used extensively in several image segmentation and skeletonization algorithms. Calculating the GVF is slow, as many iterations are needed to reach convergence. However, within each iteration every pixel or voxel can be processed in parallel. This makes GVF ideal for execution on Graphics Processing Units (GPUs). In this paper, we present a highly optimized parallel GPU implementation of GVF written in OpenCL. We have investigated memory access optimizations for GPUs, such as using texture memory, shared memory and a compressed storage format. Our results show that the algorithm benefits substantially from using texture memory and the compressed storage format on the GPU. Shared memory, on the other hand, makes the calculations slower, with or without the other optimizations, because of increased kernel complexity and synchronization. With these optimizations, our implementation can process large 2D images (512²) in real time and 3D images (256³) in only a few seconds on modern GPUs.

Keywords Gradient Vector Flow · GPU · OpenCL

1 Introduction

The Gradient Vector Flow (GVF) is a feature-preserving spatial diffusion of gradients. The GVF field is defined as the vector field V that minimizes the energy function E:

$$E(\mathbf{V}) = \int \mu\,|\nabla \mathbf{V}(\mathbf{x})|^2 + |\mathbf{V}_0(\mathbf{x})|^2\,|\mathbf{V}(\mathbf{x}) - \mathbf{V}_0(\mathbf{x})|^2 \, d\mathbf{x} \qquad (1)$$

where V_0 is the initial vector field.

The GVF was introduced by Xu and Prince [11] as a new external force field for active contours (AC). Also known as snakes or deformable models, AC are curves that move in an image while trying to minimize their energy, and they are used extensively for boundary detection and segmentation. The traditional snake introduced by Kass et al. [8] suffers from a low capture range and from getting stuck in boundary concavities. The GVF snake can deal with these problems.

Fig. 1 Example of GVF execution. Top row, from left to right: 1) Smoothed image. 2) Magnitude of image gradients V_0. 3) Magnitude of GVF after 10 iterations. 4) Magnitude of GVF after 400 iterations. Bottom row, from left to right: 1) Zoomed area of smoothed image. 2, 3 and 4) Image gradients superimposed on zoomed image after 0, 10 and 400 iterations.

Fig. 1 depicts the GVF when used for active contours. The initial image, shown top-left, is an image smoothed by convolution with a Gaussian. Next is the initial vector field V_0, displayed as vector magnitude in the top row and as vectors in a zoomed region below.
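As a rough illustration of Eq. (1) and of the diffusion shown in Fig. 1, the following plain-Python sketch evaluates a discretized version of the energy and applies one explicit update to every pixel. It is not the OpenCL implementation presented in this paper. The update rule V ← V + µ∇²V − |V_0|²(V − V_0) is an assumption here: it is the explicit scheme commonly derived from the Euler-Lagrange equation of Eq. (1), and this section does not state it. The names `clamp`, `gvf_energy` and `gvf_step`, the 5×5 test field, the replicated-border handling and µ = 0.1 are all hypothetical choices made for the example.

```python
# Minimal 2D sketch of the GVF energy and one explicit update step.
# Fields are lists of rows; each pixel holds a (vx, vy) tuple.
# Assumptions (not from the paper): unit grid spacing, replicated borders,
# forward differences for the gradient, 5-point stencil for the Laplacian.


def clamp(i, n):
    """Clamp index i to [0, n - 1], which replicates the border pixels."""
    return max(0, min(n - 1, i))


def gvf_energy(v, v0, mu):
    """Discrete version of Eq. (1): smoothness term plus weighted data term."""
    h, w = len(v), len(v[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            mag2 = v0[y][x][0] ** 2 + v0[y][x][1] ** 2  # |V0|^2
            for c in (0, 1):  # x and y components of the vector
                dx = v[y][clamp(x + 1, w)][c] - v[y][x][c]
                dy = v[clamp(y + 1, h)][x][c] - v[y][x][c]
                total += mu * (dx * dx + dy * dy)
                total += mag2 * (v[y][x][c] - v0[y][x][c]) ** 2
    return total


def gvf_step(v, v0, mu):
    """One explicit update; every output pixel reads only the previous field."""
    h, w = len(v), len(v[0])
    out = [[(0.0, 0.0)] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            mag2 = v0[y][x][0] ** 2 + v0[y][x][1] ** 2  # |V0|^2
            new = []
            for c in (0, 1):
                laplacian = (
                    v[clamp(y - 1, h)][x][c]
                    + v[clamp(y + 1, h)][x][c]
                    + v[y][clamp(x - 1, w)][c]
                    + v[y][clamp(x + 1, w)][c]
                    - 4.0 * v[y][x][c]
                )
                new.append(
                    v[y][x][c]
                    + mu * laplacian
                    - mag2 * (v[y][x][c] - v0[y][x][c])
                )
            out[y][x] = tuple(new)
    return out


# Hypothetical test field: a single gradient vector in a homogeneous region.
MU = 0.1
SIZE = 5
v0 = [[(0.0, 0.0)] * SIZE for _ in range(SIZE)]
v0[2][2] = (1.0, 0.0)

v1 = gvf_step(v0, v0, MU)  # the iteration starts from V = V0
e0 = gvf_energy(v0, v0, MU)
e1 = gvf_energy(v1, v0, MU)

print("center:", v1[2][2], "left neighbor:", v1[2][1])
print("energy before:", e0, "after one step:", e1)
```

In this toy field the single vector spreads into its four neighbors after one step, which is the diffusion of gradients into homogeneous regions that Fig. 1 shows at larger scale. Because each output pixel depends only on the previous field, the two spatial loops can run in any order, which is the property that makes the method a natural fit for parallel execution.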