Direct 3-D Shape Recovery from Image Sequence Based on
Multi-scale Bayesian Network
Norio Tagawa, Junya Kawaguchi, Shoichi Naganuma, Kan Okubo
Tokyo Metropolitan University, 6-6 Asahigaoka, Hino-shi, Tokyo 191-0065, Japan
tagawa@sd.tmu.ac.jp
Abstract
We propose a new method for recovering a 3-D object shape from an image sequence. In order to recover high-resolution relative depth without using the complex Markov random field (MRF) that includes a line process, we construct a recovery algorithm based on a belief propagation scheme using a multi-scale Bayesian network. With this algorithm, the relative 3-D motion between a camera and an object can be determined together with the relative depth, and the maximum a posteriori expectation-maximization (MAP-EM) algorithm is effectively used to obtain a suitable approximation.
1. Introduction
We propose a method for obtaining 3-D depth information using a gradient-based scheme with two successive images. In this field of study, spatially dense and stable detection is strongly required [1]–[3], and the aperture problem and the aliasing problem need to be solved completely [4]. Usually, either local optimization or global optimization is used to avoid the aperture problem. To avoid the aliasing problem, components of low spatial frequency are extracted by low-pass filtering and used to compute the optical flow. However, these techniques lower the resolution of the obtained optical flow and hence that of the relative depth.
In this study, we attempt to recover 3-D depth information directly, without explicitly detecting optical flow, by constructing a Bayesian network that extends along the resolution direction through a decomposition of the original image into multi-scale images. The unknown parameters are represented as nodes, as are the depth to be estimated and the observed image information. We call this graphical model a multi-scale Bayesian network. If the parameters, including the relative 3-D motion parameters, are determined in advance, the inference of depth in this network can be realized by Kalman filtering. In particular, for optical flow detection, Simoncelli [4] introduced a multi-scale Bayesian network that treats the optical flow as a node with parameters assumed to be known, and proposed a Kalman filter-based algorithm. In our study, we attempt to estimate the depth and the parameters simultaneously from the observations.
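The coarse-to-fine decomposition underlying such a multi-scale network can be illustrated with a standard Gaussian pyramid. The sketch below is our own illustration (it uses a common 5-tap binomial filter and factor-of-two subsampling, which the paper does not specify), not the authors' implementation:

```python
import numpy as np

def gaussian_pyramid(image, num_levels=4):
    """Decompose an image into a coarse-to-fine multi-scale pyramid.

    Each level is obtained by blurring the previous one with a small
    separable low-pass kernel and subsampling by a factor of two.
    """
    kernel = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # binomial filter
    levels = [image.astype(float)]
    for _ in range(num_levels - 1):
        blurred = levels[-1]
        # separable convolution: filter rows, then columns
        blurred = np.apply_along_axis(
            lambda row: np.convolve(row, kernel, mode="same"), 1, blurred)
        blurred = np.apply_along_axis(
            lambda col: np.convolve(col, kernel, mode="same"), 0, blurred)
        levels.append(blurred[::2, ::2])  # subsample by 2
    return levels

img = np.random.rand(64, 64)
pyr = gaussian_pyramid(img, num_levels=4)
print([level.shape for level in pyr])  # [(64, 64), (32, 32), (16, 16), (8, 8)]
```

The low-pass filtering before subsampling suppresses the high spatial frequencies that would otherwise alias, which is exactly why coarse levels are useful for stable, alias-free estimation.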
The parameters to be estimated are common to all multi-scale images, and hence we have to adopt a suitable approximation to simplify the inference. In the most tractable approximations, the parameters are treated as independent between multi-scale images. However, the information about the parameters obtained in a low-resolution image is then not directly propagated; it is only implicitly propagated through the propagation of the depth information. We propose a stable procedure based on the maximum a posteriori expectation-maximization (MAP-EM) algorithm, which can propagate the parameters' information directly.
2. Gradient method for recovering depth
2.1. Projection model and optical flow
We use perspective projection as our camera-imaging model. The camera is fixed in an (X, Y, Z) coordinate system, where the viewpoint (lens center) is at the origin O and the optical axis lies along the Z-axis. The projection plane (image plane) Z = 1 can be used without any loss of generality, which means that the focal length equals 1. A space point (X, Y, Z) on the object is projected to the image point (x, y). At each (x, y),
the optical flow $[v_x, v_y]^\top$ is formulated with the inverse depth $d(x, y) \equiv 1/Z(x, y)$ and the camera's translational and rotational vectors $u = [u_x, u_y, u_z]^\top$ and $r = [r_x, r_y, r_z]^\top$, respectively, as follows:

$$v_x = xy\,r_x - (1 + x^2)\,r_y + y\,r_z - (u_x - x u_z)\,d, \quad (1)$$
$$v_y = (1 + y^2)\,r_x - xy\,r_y - x\,r_z - (u_y - y u_z)\,d. \quad (2)$$
978-1-4244-2175-6/08/$25.00 ©2008 IEEE