550 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 10, NO. 4, JUNE 2000
3-D Scene Reconstruction with Viewpoint Adaptation
on Stereo Displays
André Redert, Emile Hendriks, and Jan Biemond, Fellow, IEEE
Abstract—In this paper, we propose a generic algorithm for
the geometrically correct reconstruction of 3-D scenes on stereo
displays with viewpoint adaptation. This forms the basis of
multiviewpoint systems, which are currently the most promising
candidates for real-time implementations of 3-D visual communication systems. The reconstruction algorithm requires 3-D
tracking of the viewers' eyes with respect to the display. We
analyze the effect of eye-tracking errors and derive a simple
bound below which reconstruction errors cannot be observed. We design
a multiviewpoint system using a recently introduced image-based
scene representation. The design formed the basis of the real-time
multiviewpoint system that was recently built in the European
PANORAMA project. Experiments with both natural and syn-
thetic scenes show that the proposed reconstruction algorithm
performs well. The experiments are carried out both in computer
simulation and on the real-time PANORAMA system.
Index Terms—Motion parallax, multiviewpoint system, stereo
displays, viewpoint adaptation, 3-D scene reconstruction.
I. INTRODUCTION
STEREO and 3-D systems are emerging rapidly in the
area of human visual communication. Applications can
be found in medical areas (remote expert consultancy during
operations), industrial areas (inspection in hazardous environments), and interpersonal communication, in which
telepresence is enhanced.
In “through-the-window” based systems, in which the scene
is reconstructed by a 2-D display [28], ideal scene reconstruction can, in principle, be achieved by holograms. They allow
any number of viewers simultaneously, and provide for each of
them the stereoscopic depth cue (a different image presented to
each eye), the lens accommodation cue (focal length of the eye
lens is related to the depth of the chosen object of interest), and
the motion parallax cue (scene viewpoint changes when viewer
moves). However, the current state of technology does not allow
real-time holographic video acquisition systems.
For implementations of real-time 3-D communication systems, stereo and multiviewpoint systems are currently the most
promising candidates [8]. These systems aim to provide as
many of the aforementioned cues as possible while minimizing
geometric and photometric reconstruction errors.
Manuscript received March 15, 1999; revised September 30, 1999. This work
was supported by the European PANORAMA project. This paper was recom-
mended by Guest Editor M. G. Strintzis.
A. Redert was with the Information and Communication Theory Group, Delft
University of Technology, 2628 CD Delft, The Netherlands. He is now with
Philips Research Laboratories, 5656 AA, Eindhoven, The Netherlands.
E. Hendriks and J. Biemond are with the Information and Communication
Theory Group, Delft University of Technology, 2628 CD Delft, The Nether-
lands.
Publisher Item Identifier S 1051-8215(00)04887-4.
Fig. 1. Geometric distortion in a stereo system.
In a stereo system, the scene is captured by two normal cam-
eras, the images of which are transmitted and shown on a stereo
CRT or LCD-like display [3]. By some means, the two images
are projected separately on the left and right eye. Such a stereo
system provides the stereoscopic depth cue, which gives some
impression of depth in the scene. The accommodation cue does
not work, since the lens of the human eye focuses on the display, regardless of the depth of the object on which the viewer
focuses his attention. This results in a conflict between the convergence of the two eyes and the accommodation, causing visual
strain [9], [21].
The motion parallax cue is also absent, since the displayed
images do not depend on the position of the viewer. At most, one
specific stereo viewpoint provides a geometrically undistorted
scene reconstruction [1], [12] (see Fig. 1). Any movement away
from this position results in geometric distortion of the observed
scene. The distortion depends nonlinearly on both the viewing
position and the scene point positions. It yields distortion of position,
angles, and scale (e.g., the so-called “puppet theater effect” [12]).
The minimization of visual strain due to such distortions is
not easy. It requires a careful setup of cameras and display [1],
[9], [21], guided by subjective tests that provide information
about the resilience of human vision to the distortions.
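To sketch why this distortion arises, consider that each displayed image point fixes a viewing ray from the corresponding eye through the screen, and the perceived 3-D point is where the two rays meet (or pass closest). The following small numerical example uses illustrative geometry and values of our own choosing, not values from the paper:

```python
import numpy as np

# Illustrative geometry (assumed values): display in the plane z = 0,
# viewer at z = -view_dist, scene point behind the display.
eye_sep, view_dist = 0.065, 0.60                    # metres
el0 = np.array([-eye_sep / 2, 0.0, -view_dist])     # design left-eye position
er0 = np.array([+eye_sep / 2, 0.0, -view_dist])     # design right-eye position
p = np.array([0.10, 0.05, 0.50])                    # scene point, 0.5 m deep

def project_to_screen(eye, point):
    """Intersect the ray eye -> point with the display plane z = 0."""
    t = -eye[2] / (point[2] - eye[2])
    return eye + t * (point - eye)

def perceived_point(eye_l, eye_r, scr_l, scr_r):
    """Midpoint of closest approach of the two viewing rays
    eye_l -> scr_l and eye_r -> scr_r (skew lines in general)."""
    d1, d2, r = scr_l - eye_l, scr_r - eye_r, eye_l - eye_r
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ r, d2 @ r
    denom = a * c - b * b
    t = (b * e - c * d) / denom
    s = (a * e - b * d) / denom
    return 0.5 * ((eye_l + t * d1) + (eye_r + s * d2))

# The stereo pair is rendered once, for the design eye positions.
sl, sr = project_to_screen(el0, p), project_to_screen(er0, p)

# At the design viewpoint, the scene point is reconstructed exactly.
print(perceived_point(el0, er0, sl, sr))            # ~ [0.10, 0.05, 0.50]

# A 10 cm sideways head shift (images unchanged) distorts the scene.
shift = np.array([0.10, 0.0, 0.0])
print(perceived_point(el0 + shift, er0 + shift, sl, sr))
```

In this example, a purely lateral head shift preserves the reconstructed depth and shears the scene sideways; head movements toward or away from the display change the perceived depth nonlinearly, consistent with the nonlinear dependence on viewing position noted above.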
Multiviewpoint systems reduce the distortions shown in Fig. 1
by introducing motion parallax. They provide images to
the viewer that change with viewing position. Two different
approaches can be distinguished. First, there are displays that provide a limited
number of viewpoints simultaneously [7], [28]. These displays
may serve multiple viewers. However, the freedom of move-
ment is restricted (mostly only horizontal) and the motion par-
allax is not continuous but discrete.
The second type of multiviewpoint system provides only a
single stereo-image pair to a single viewer (see Fig. 2). In this
system, a model of the scene must first be acquired, e.g., on
the basis of stereo-image capture and analysis. After transmission of the model, a new stereo pair is synthesized at the reconstruction side. The motion parallax cue is then provided by continuously adapting the displayed stereo images to the current eye