Learn to Track Edges
Yanghai Tsin, Yakup Genc, Ying Zhu, Visvanathan Ramesh
Real-Time Vision and Modeling Department
Siemens Corporate Research, Princeton, NJ, 08540, USA
Abstract
Reliability of a model-based edge tracker critically depends on its ability to establish correct correspondences between points on the model edges and edge pixels in an image. This is a non-trivial problem, especially in the presence of large inter-frame motions and in cluttered environments. We propose an online learning approach to solving this problem. An edge pixel is represented by a descriptor composed of a small segment of intensity patterns. From training examples, the algorithm uses a randomized forest model to learn the a posteriori distribution of correspondences given the descriptor. In a new frame, edge pixels are classified using maximum a posteriori (MAP) estimation. The proposed method is powerful: it enables the tracker to handle many previously impossible scenarios with unprecedented robustness.
1. Introduction
The goal of this work is to design a precise and robust model-based edge tracker. Typical applications of such a tracker include robot localization, robotic assembly and augmented reality. Edge tracking is an important technique in these areas largely because it is view-independent and resilient to moderate appearance changes. In some cases it is the only viable choice, such as when a scene contains no strong corner features. Edge tracking has been studied intensively since the inception of computer vision research. However, a reliable tracker is still unavailable, and researchers still deem edge trackers “brittle” [24]. Such trackers usually cannot survive the harsh conditions of real applications, where clutter, large appearance changes and fast motions are ubiquitous.
The main difficulty in edge tracking is establishing correspondences between points on the model edges and the edge pixels in an image. To fully appreciate the difficulty involved, we give an example in Figure 1. We direct the reader’s attention to regions A and B marked on the top image and their corresponding areas in the center one. Region A contains complex structures: many irrelevant edge pixels are detected by an edge detector [6], and identifying true correspondences becomes error-prone simply because there are too many candidates. The case is the opposite for the line marked B: the edge pixels corresponding to the model are missing. Almost all tracking failures can be traced back to false correspondences caused by these two cases, clutter and missing edge pixels. In dynamic environments the problem becomes even harder due to many factors, e.g., lighting changes, motion blur, scale variation and camera gain control.

Figure 1. Top: input image with the model projection superimposed as red lines. Middle: Canny edge detector results (Matlab implementation). Bottom: the model-associated edge pixels classified using the proposed approach.
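The two failure modes above are two sides of one thresholding trade-off, which a tiny numeric sketch makes concrete. The detector below is a deliberately simplified stand-in for Canny (gradient-magnitude thresholding only; no smoothing, non-maximum suppression, or hysteresis), and the synthetic image is an assumption for illustration: a low threshold floods the map with clutter responses (as in region A), while a high one deletes the true edge (as in region B).

```python
import numpy as np

def edge_pixels(image, threshold):
    """Mark pixels whose gradient magnitude exceeds a threshold.
    A simplified stand-in for a real edge detector such as Canny."""
    gy, gx = np.gradient(image.astype(float))
    return np.hypot(gx, gy) > threshold

# Synthetic frame: one true step edge plus a textured clutter patch.
img = np.zeros((32, 32))
img[:, 16:] = 100.0                                            # true edge
img[:8, :8] = 50.0 * np.random.default_rng(1).random((8, 8))   # clutter

low  = edge_pixels(img, 5.0)    # clutter fires everywhere (region A case)
high = edge_pixels(img, 60.0)   # true edge disappears too (region B case)
```

No single threshold resolves both regions at once, which is why the paper moves the decision from a global gradient threshold to a learned, per-pixel classification.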
Difficulties involved in binary edge detection can be partially alleviated by using intensity gradients [11]. The intensity gradient incorporates more context than a binary edge map. However, in cluttered environments these methods encounter similar ambiguity and usually converge to a local minimum.
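The local-minimum failure can be seen in one dimension. Gradient-based trackers commonly search along the normal of each projected model edge and lock onto the strongest gradient response near the prediction; the sketch below (hypothetical profile and window sizes, not from the paper) shows how a strong clutter response inside the search window wins over the true edge.

```python
import numpy as np

def match_along_normal(profile, predicted_pos, search_radius):
    """Pick the strongest gradient response within a window around the
    predicted edge position (a common 1-D edge search strategy)."""
    grad = np.abs(np.gradient(profile.astype(float)))
    lo = max(predicted_pos - search_radius, 0)
    hi = min(predicted_pos + search_radius + 1, len(profile))
    return lo + int(np.argmax(grad[lo:hi]))

# Intensity profile sampled along a model edge's normal:
# the true edge is at index 30, a stronger clutter edge near index 20.
profile = np.full(50, 20.0)
profile[30:] = 120.0          # true model edge (step of 100)
profile[20:24] = 250.0        # clutter edge (larger step)

# With an accurate prediction and a tight window, the true edge is found;
# with the prediction off by a few pixels, the clutter response wins.
good_match = match_along_normal(profile, predicted_pos=30, search_radius=3)
bad_match  = match_along_normal(profile, predicted_pos=26, search_radius=8)
```

Once such a false correspondence enters the pose update, the predicted position drifts further toward the clutter in the next frame, which is the "convergence to a local minimum" behavior described above.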
978-1-4244-1631-8/07/$25.00 ©2007 IEEE