978-1-4673-0174-9/12/$31.00 ©2012 IEEE ICALIP2012

Accelerating Volume Ray Casting by Empty Space Skipping Used for Computer-Aided Therapy

1 Yinong Wang, 1 Weibei Dou, 2 Jean-Marc Constans
1 Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China
2 Unité d’IRM, CHU de Caen, 14033 Caen, France
E-mail: wangyinongwww@163.com, douwb@tsinghua.edu.cn

Abstract

Volume ray casting (VRC) is an image-based Direct Volume Rendering (DVR) technique, a powerful tool for visualizing three-dimensional scalar data, and can provide the necessary perspective effect for volume data such as CT or MRI. However, its computational complexity is a bottleneck for its application in Computer-Aided Therapy (CAT). A parallel computing architecture such as Single-Instruction-Multiple-Data (SIMD) is a good platform for implementing VRC. Even so, the computational speed depends on the dataset, especially on the ratio of nonempty voxels to empty voxels, which are defined by opacity value. This paper proposes an empty space skipping technique based on GPGPU for accelerating VRC. It includes two strategies that improve on other techniques: encoding-based pre-sampling from texture memory, and disregarding the empty voxels to reduce the generation of the bounding box. Performance, measured in frames per second, is tested on 4 general testing datasets and 2 brain tumor patients’ datasets. The results show that the proposed strategies contribute about a 2 times speedup compared with non-skipping VRC on CUDA.

1. Introduction

VRC [1] is one of the volume rendering techniques and can provide the necessary perspective effect for volume data such as CT or MRI. The algorithm is very costly in time because of its heavy computation, which limits its wide application. Many optimization techniques have been proposed. Early ray termination [2][3][4] is based on an observation about the human visual system.
Ray casting can be terminated once the accumulated opacity reaches a given level or the ray has marched a sufficient distance through a semitransparent object. The ray box technique [5][6][7] treats the volume data space as a box and excludes rays that do not hit the volume. The empty space skipping technique [3][4] classifies voxels as empty or nonempty and exploits this feature to traverse the space of empty voxels rapidly.

The empty space skipping technique first encodes the opacity information of the volume data to detect whether a voxel is empty. If the voxel is empty, the ray skips it with a larger step size. This algorithm can noticeably reduce the number of accesses to the volume data. Levoy [3] first introduced this method, called hierarchical spatial enumeration. For GPU-based VRC, Kruger and Westermann [8] proposed a min-max octree encoding method that can efficiently obtain the encoding data when the classification changes. A drawback of this method is that the opacity value of a voxel is relevant only to the scalar value of the volume data. However, recent transfer function design holds that the opacity value of a voxel is also relevant to other factors, for example gradients, local image information, or transfer function generation models. The space leaping technique [9][10] calculates a skip value for each empty voxel, which requires extensive processing in the encoding stage.

Besides the techniques mentioned above, researchers also use hardware to accelerate the ray casting algorithm. In the past, ray casting was accelerated with specialized shader languages on earlier programmable GPUs [8][11]. With CUDA (Compute Unified Device Architecture), a platform for GPGPU (General-Purpose computing on Graphics Processing Units), ray casting can be implemented on the GPU with just the C language, which is a great convenience for developers. NVIDIA published the first demo of ray casting on CUDA in 2008 [6][7].
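
The early ray termination and empty space skipping ideas described above can be illustrated on a single ray. The following is a minimal sketch in Python, not the implementation proposed in this paper: it works on a one-dimensional list of samples instead of a 3D texture, and it uses a flat, fixed-size block encoding rather than an octree or the pre-sampling encoding. The function names, the block size and the 0.95 opacity threshold are all assumptions chosen for illustration.

```python
def encode_blocks(opacity, block):
    """Encoding stage: one flag per block of samples along the ray.

    A block is marked nonempty if any of its samples has opacity > 0.
    """
    return [
        any(a > 0.0 for a in opacity[start:start + block])
        for start in range(0, len(opacity), block)
    ]


def cast_ray(color, opacity, block=4, threshold=0.95, skip=True):
    """Front-to-back compositing along one ray.

    Returns (accumulated color, accumulated opacity, sample fetches).
    `skip` toggles empty space skipping so both variants can be compared.
    """
    flags = encode_blocks(opacity, block)
    acc_color = 0.0
    acc_alpha = 0.0
    fetches = 0
    i = 0
    while i < len(opacity):
        if skip and not flags[i // block]:
            # Empty block: jump straight to the first sample of the next block.
            i = (i // block + 1) * block
            continue
        fetches += 1  # one access to the volume data
        a = opacity[i]
        c = color[i]
        # Front-to-back "over" compositing.
        acc_color += (1.0 - acc_alpha) * a * c
        acc_alpha += (1.0 - acc_alpha) * a
        if acc_alpha >= threshold:
            # Early ray termination: later samples are barely visible.
            break
        i += 1
    return acc_color, acc_alpha, fetches
```

Calling `cast_ray` with `skip=True` and with `skip=False` on the same samples returns the same accumulated color and opacity, while the fetch count shows how many accesses to the volume data the skipping saves. In this sketch the saving grows with the share of blocks that contain only empty samples, which matches the dependence on the ratio of nonempty to empty voxels noted in the abstract.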
The NVIDIA demo implements only the basic ray casting algorithm, without any illumination model or volume voxel-size information. Despite this, it shows a great speedup over the traditional CPU-based ray casting method. Lukas Marsalek [5] implements ray casting with front-to-back traversal, pre-integration and early ray termination on a GeForce 8800 GT. His method gets more speedup by