IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 12, NO. 5, SEPTEMBER/OCTOBER 2006
A Generic and Scalable Pipeline
for GPU Tetrahedral Grid Rendering
Joachim Georgii and Rüdiger Westermann
Abstract— Recent advances in algorithms and graphics hardware have made it possible to render tetrahedral grids at interactive
rates on commodity PCs. This paper extends this work by presenting a direct volume rendering method for such grids which
supports both current and upcoming graphics hardware architectures, large and deformable grids, as well as different rendering
options. At the core of our method is the idea to perform the sampling of tetrahedral elements along the view rays entirely in local
barycentric coordinates. Sampling then requires minimal GPU memory and texture access operations, and it maps efficiently
onto a feed-forward pipeline of multiple stages performing computation and geometry construction. We propose to spawn rendered
elements from one single vertex. This makes the method amenable to upcoming Direct3D 10 graphics hardware, which allows
geometry to be created on the GPU. With only slight modifications, the algorithm can be used to render per-pixel iso-surfaces and to perform
tetrahedral cell projection. Because our method requires neither pre-processing nor an intermediate grid representation, it can efficiently
deal with dynamic and large 3D meshes.
Index Terms—Direct volume rendering, unstructured grids, programmable graphics hardware
1 INTRODUCTION AND MOTIVATION
Although recent advances in graphics hardware have opened the pos-
sibility to efficiently render tetrahedral grids on commodity PCs, inter-
active rendering of large and deformable grids is still one of the main
challenges in scientific visualization. Such grids are encountered
increasingly often in applications ranging from plastic and
reconstructive surgery and virtual training simulators to fluid and
solid mechanics.
The weakness of GPU-based volume rendering techniques for tetrahedral
grids is that they do not effectively exploit the potential of recent
GPUs. The reason lies in the re-sampling process for tetrahedral
elements. At every sample point, this process requires the geometry of
the element that contains the point. The geometry is used to compute
the point's position in the local coordinate space of the element.
Most generally, an element matrix built from the element's vertex
coordinates is used for this purpose.
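The role of the element matrix can be illustrated with a minimal sketch. It assumes one common formulation, which this section does not spell out: the matrix is the inverse of the 3x3 matrix whose columns are the edge vectors from the first vertex, and the barycentric coordinates then weight the per-vertex scalar values. All function names are hypothetical and chosen for illustration only.

```python
def sub(a, b):
    """Component-wise difference of two 3D points."""
    return [a[i] - b[i] for i in range(3)]

def dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Cross product of two 3D vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def element_matrix(v0, v1, v2, v3):
    """Rows of the inverse of the edge matrix [v1-v0 | v2-v0 | v3-v0]."""
    e1, e2, e3 = sub(v1, v0), sub(v2, v0), sub(v3, v0)
    det = dot(e1, cross(e2, e3))
    if det == 0:
        raise ValueError("degenerate tetrahedron")
    # Row i of the inverse is orthogonal to the other two edges and has
    # unit dot product with edge i.
    return [[c / det for c in cross(e2, e3)],
            [c / det for c in cross(e3, e1)],
            [c / det for c in cross(e1, e2)]]

def barycentric(p, v0, m):
    """Barycentric coordinates of p with respect to (v0, v1, v2, v3)."""
    d = sub(p, v0)
    l1, l2, l3 = (dot(row, d) for row in m)
    return [1.0 - l1 - l2 - l3, l1, l2, l3]

def resample(p, verts, scalars):
    """Linearly interpolate the per-vertex scalars at point p."""
    m = element_matrix(*verts)  # needed once per element, not per sample
    lam = barycentric(p, verts[0], m)
    return sum(l * s for l, s in zip(lam, scalars))
```

A sample point lies inside the element exactly when all four coordinates are non-negative. Note that `resample` rebuilds the matrix on every call purely for brevity; the matrix depends only on the element, not on the sample point.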
For every element this matrix only has to be computed once and can
then be used to re-sample the data at every sample point in its inte-
rior. To do so, a container storing the matrices of all elements has to
be created on the GPU. It is clear that this approach significantly in-
creases the memory requirements. Moreover, because the re-sampling
is performed in the fragment stage, every fragment needs to be as-
signed the unique identifier of the element it is contained in to address
the respective matrix. In scan-conversion algorithms this can only be
done by issuing these identifiers as additional per-vertex attributes in
the rendering of the tetrahedral elements. Unfortunately, because ev-
ery vertex is shared by many elements in general, a shared vertex list
can no longer be used to represent the grid geometry on the GPU. This
causes an additional increase in memory.
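The memory argument above can be made concrete with a rough back-of-the-envelope sketch. The byte sizes, the 12-float matrix layout, and the mesh sizes in the usage note are illustrative assumptions, not figures from this paper.

```python
FLOAT_BYTES = 4  # assumption: 32-bit floats
INDEX_BYTES = 4  # assumption: 32-bit indices and element identifiers

def shared_layout_bytes(num_vertices, num_tets):
    """Shared vertex list: each position stored once, four indices per element."""
    positions = num_vertices * 3 * FLOAT_BYTES
    indices = num_tets * 4 * INDEX_BYTES
    return positions + indices

def per_element_layout_bytes(num_tets, matrix_floats=12):
    """Unshared layout: every element stores its own four vertices, each
    tagged with the element identifier, plus one precomputed matrix."""
    vertices = num_tets * 4 * (3 * FLOAT_BYTES + INDEX_BYTES)
    matrices = num_tets * matrix_floats * FLOAT_BYTES
    return vertices + matrices
```

For a hypothetical mesh with 1,000 vertices and 5,000 elements, `shared_layout_bytes(1000, 5000)` gives 92,000 bytes, whereas `per_element_layout_bytes(5000)` gives 560,000 bytes under these assumptions.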
To avoid the memory overhead induced by pre-computations, element
matrices can be calculated in turn for every sample point. But then the
same computations, including multiple memory access operations to
fetch the respective coordinates, have to be performed for all sample
points in the interior of a single element, thereby wasting a
significant portion of the GPU's compute power. As before, identifiers
are required to access vertex coordinates, and thus a shared vertex
array cannot be used.
• Joachim Georgii, E-mail: georgii@in.tum.de.
• Rüdiger Westermann, E-mail: westermann@in.tum.de. All authors are
with the Computer Graphics & Visualization Group, Technische
Universität München.
Manuscript received 31 March 2006; accepted 1 August 2006; posted online 6
November 2006.
For information on obtaining reprints of this article, please send e-mail to:
tvcg@computer.org.
1.1 Contribution
In this paper we present a GPU pipeline for the rendering of
tetrahedral grids that avoids the aforementioned drawbacks. This
pipeline is scalable with respect to both large data sets and future
graphics hardware. The proposed method has the following properties:
• Per-element calculations are performed only once.
• Tetrahedral vertices and attributes can be shared in vertex and
attribute arrays.
• Besides the shared vertex and attribute arrays, almost no
additional memory is required on the GPU.
• Re-sampling of (deforming) tetrahedral elements is performed
using a minimal memory footprint.
1.2 System Overview
To achieve our goal we propose a generic and scalable GPU rendering
pipeline for tetrahedral elements. This pipeline is illustrated in Figure
1. It consists of multiple stages performing element assembly, primi-
tive construction, rasterization and per-fragment operations.
Fig. 1. Overview of the GPU rendering pipeline.
To render a tetrahedral element the pipeline is fed with one single ver-
tex, which carries all information necessary to assemble the element
geometry on the GPU. This stage is described in Section 3.1. As-
sembled geometry is then passed to the construction stage where a
renderable representation is built.
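The idea of spawning an element from one single vertex can be sketched as follows. This section does not state what the vertex carries, so the sketch assumes it holds four indices into the shared vertex and attribute arrays; the function name and data layout are hypothetical, and the actual assembly stage is the one described in Section 3.1.

```python
def assemble_element(spawn_vertex, positions, scalars):
    """Gather the geometry and attributes of one tetrahedron.

    spawn_vertex: tuple of four indices (an assumed layout, see above).
    positions:    shared vertex array, one 3D position per grid vertex.
    scalars:      shared attribute array, one scalar value per grid vertex.
    """
    if len(spawn_vertex) != 4:
        raise ValueError("a tetrahedron needs exactly four vertex indices")
    element_positions = [positions[i] for i in spawn_vertex]
    element_scalars = [scalars[i] for i in spawn_vertex]
    return element_positions, element_scalars
```

Because only indices travel with the spawning vertex, positions and attributes stay in the shared arrays and are never duplicated per element.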
The construction stage is explicitly designed to account for the
functionality of upcoming graphics hardware. With Direct3D 10
compliant hardware and geometry shaders [1] it will be possible to create
additional geometry on the graphics subsystem. In particular, trian-
gle strips or fans composed of several vertices, each of which can be
1077-2626/06/$20.00 © 2006 IEEE Published by the IEEE Computer Society