UNDERSTANDING DYNAMIC SCENES BY HIERARCHICAL MOTION PATTERN MINING
Lei Song¹, Fan Jiang², Zhongke Shi¹, Aggelos K. Katsaggelos²
¹ School of Automation, Northwestern Polytechnical University, Xi’an, 710072, China
² Dept of EECS, Northwestern University, Evanston, 60208, USA
songlei@mail.nwpu.edu.cn, {ffji295, aggk}@eecs.northwestern.edu, zkeshi@nwpu.edu.cn
ABSTRACT
Our work addresses the problem of analyzing and
understanding dynamic video scenes. A two-level motion
pattern mining approach is proposed. At the first level,
single-agent motion patterns are modeled as distributions
over pixel-based features. At the second level, interaction
patterns are modeled as distributions over single-agent
motion patterns. Both patterns are shared among video clips.
Compared to other works, the advantage of our method is
that interaction patterns are detected and assigned to every
video frame. This enables a finer semantic interpretation and
more precise anomaly detection. Specifically, every video
frame is labeled with an interaction pattern, and moving
pixels that do not belong to any single-agent pattern or
cannot occur under the frame’s interaction pattern are
detected as anomalies. We tested our approach on a
challenging traffic surveillance sequence containing both
pedestrian and vehicular motion and obtained promising
results.
Index Terms—Visual surveillance, LDA, motion
pattern analysis, anomaly detection
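The two-level structure described in the abstract can be illustrated with a toy sketch. It is not the inference procedure of this paper: the feature names, pattern names, probabilities, and the threshold `eps` are all hypothetical, and the patterns are written down by hand rather than learned. It only shows how a distribution over single-agent patterns combines with distributions over pixel-based features, and how the anomaly rule follows from that combination.

```python
# Toy sketch only: every name, number, and threshold here is an assumption.

# Level 1: each single-agent pattern is a distribution over pixel-based
# features (hypothetical "cell + direction" labels).
SINGLE_AGENT = {
    "northbound": {"cellA_up": 0.6, "cellB_up": 0.4},
    "eastbound": {"cellC_right": 0.7, "cellD_right": 0.3},
}

# Level 2: each interaction pattern is a distribution over single-agent
# patterns.
INTERACTION = {
    "vertical_flow": {"northbound": 1.0, "eastbound": 0.0},
    "horizontal_flow": {"northbound": 0.0, "eastbound": 1.0},
}


def feature_prob(feature, interaction):
    """p(feature | interaction) = sum_z p(feature | z) * p(z | interaction)."""
    return sum(
        weight * SINGLE_AGENT[z].get(feature, 0.0)
        for z, weight in INTERACTION[interaction].items()
    )


def is_anomalous(feature, interaction, eps=1e-3):
    """Flag a moving-pixel feature that no single-agent pattern explains,
    or whose explaining patterns have no weight in the frame's interaction
    pattern. Both cases drive the marginal probability below eps."""
    return feature_prob(feature, interaction) < eps
```

With these toy values, `is_anomalous("cellA_up", "horizontal_flow")` is true because the northbound pattern has zero weight in the horizontal-flow interaction, even though the motion itself follows a known single-agent pattern.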
1. INTRODUCTION
In many surveillance scenarios, such as a crowded traffic
scene, a busy train station, or a shopping mall, many
different motions occur at once. It is highly desirable to
analyze the motion patterns and obtain a high-level
interpretation of the semantic content. For example, in a
video monitoring a traffic intersection, it is useful to
discover, without any prior knowledge of the traffic rules in
that scene, the typical vehicle behaviors and their
dependencies, and to detect anomalous motion for security
purposes.
Motion patterns in a complex dynamic scene usually have a
hierarchical nature. Typically, many objects (e.g., vehicles)
are present in the scene. The motion of each single object
tends to follow regular streams, which are single-agent
motion patterns. In addition, the co-occurrence of multiple
objects at the same time may also be subject to constraints,
which define interaction patterns. For example, in the
traffic intersection scenario, the single-agent motion
patterns are all the legal paths through the intersection
(shown in Fig. 1(a) and numbered from 1 to 7), while the
interaction patterns are the possible combinations of paths
determined by the traffic lights (shown in Fig. 1(b) as
combinations 1 and 2).
__________________________
The work of A. K. Katsaggelos was supported in part by a grant
from the US Department of Energy (DE-NA0000431).
Fig. 1. (a) Single-agent motion patterns; (b) interaction patterns.
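As a rough illustration of the intersection example, an interaction pattern can be thought of as a set of paths that are allowed to be active together. The sketch below is hypothetical: the text does not list which of the seven paths belong to combinations 1 and 2, so the groupings here are made up, and `compatible_patterns` is an illustrative helper rather than part of the proposed method.

```python
# Hypothetical groupings: the text does not say which paths make up
# combinations 1 and 2 in Fig. 1(b), so these sets are invented.
INTERACTION_PATTERNS = {
    1: frozenset({1, 2, 3, 4}),
    2: frozenset({5, 6, 7}),
}


def compatible_patterns(active_paths):
    """Return the ids of interaction patterns that permit every active path."""
    active = set(active_paths)
    return [
        pattern_id
        for pattern_id, allowed in sorted(INTERACTION_PATTERNS.items())
        if active <= allowed  # subset test: all active paths are allowed
    ]
```

Under these made-up groupings, paths 1 and 3 may co-occur (only pattern 1 permits both), whereas paths 2 and 6 together match no interaction pattern, which is the kind of co-occurrence violation the second level is meant to capture.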
Considering this hierarchical nature of motion patterns,
many works on scene understanding and motion pattern
discovery are based on hierarchical modeling. One common
approach is based on object trajectory analysis. Typically,
objects are tracked in video and an analysis and mining
approach is applied to the object trajectories to discover
motion patterns. For example, Jiang et al. [2] use an HMM
to characterize object trajectories and a BIC-based
dissimilarity measure to cluster highly recurrent events.
Duong et al. [3] introduce the Switching Hidden
Semi-Markov Model for atomic activity modeling, and the
high-level activities are modeled as a sequence of atomic
activities. Jiang et al. [4] characterize crowded motion
by a patch-based local motion representation and cluster the
patches into different motion patterns by spectral clustering.
Basharat et al. [5] detect abnormal events based on the local
and global behavior of tracks. Instead of clustering tracks
into major paths, they build local pixel-level probability
density functions that capture a variety of tracks.
However, tracking-based methods are sensitive to
object detection, recognition, and tracking errors, and they
usually fail in complicated or crowded scenes due primarily
978-1-61284-350-6/11/$26.00 ©2011 IEEE