What Are We Looking For: Towards Statistical Modeling of Saccadic Eye
Movements and Visual Saliency
Xiaoshuai Sun, Hongxun Yao, Rongrong Ji
Dept. of Computer Science, Harbin Institute of Technology, Heilongjiang, China
{xiaoshuaisun, H.yao, rrji}@hit.edu.cn
Abstract
In this paper, we present a unified statistical framework for modeling
both saccadic eye movements and visual saliency. By analyzing the
statistical properties of human eye fixations on natural images, we
found that human attention is sparsely distributed and usually deployed
to locations with abundant structural information. These observations
inspired us to model saccadic behavior and visual saliency based on
Super Gaussian Component (SGC) analysis. The model sequentially obtains
SGCs using projection pursuit and generates eye movements by selecting
the location with the maximum SGC response. Besides simulating human
saccadic behavior, we also demonstrate superior effectiveness and
robustness over state-of-the-art methods through extensive experiments
on psychological patterns and human eye fixation benchmarks. These
results also show the promising potential of statistical approaches for
human behavior research.
1. Introduction
Attention-guided saccadic eye movement is one of the most important
mechanisms in biological vision systems; it allows the viewer to
actively explore the environment with a high-resolution foveal sensor.
Benefiting from this behavior, human beings, as well as most primates,
are able to efficiently process information from complex environments.
Over the last four decades, extensive research has been carried out
through theoretical reasoning and computational modeling to uncover the
principles that underlie the deployment of gaze. Compared with
theoretical hypotheses, computational models of visual attention and
saccadic eye movement not only help us better understand the mechanisms
of human cognitive behavior but also provide powerful tools for solving
various vision-related problems such as video compression [1], scene
understanding [2], and object detection and recognition [3].
Figure 1. What are we looking for when viewing a scene? Our studies
suggest that the answer to this question could be revealed via a
statistical analysis of human eye fixations. One possible answer,
named Super Gaussian Component, is investigated in this paper.
In this paper, our goal is to establish a statistical framework
for both saccadic behavior simulation and visual
saliency analysis. Unlike previous works that drew inspiration from
existing neurobiological knowledge or mathematical theories, we make
our assumptions directly from a statistical analysis of ground-truth
human eye fixations. Through this analysis, we try to find out "what
components in visual images draw fixations", a question that is similar
to, but more tractable than, the traditional question of "what
properties draw attention". The analysis is conducted on eye fixation
data captured from human observers with an eye-tracking device during
task-independent free viewing of natural images. In this bottom-up
scenario, we have found an interesting phenomenon, which might later
prove to be a general principle: stimuli with a super-Gaussian
distribution are more likely to attract human gaze. Based on this
finding, human saccadic behavior can be modeled as a process of active
information pursuit that targets statistical components with desired
properties such as super-Gaussianity.
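To make the notion of super-Gaussianity concrete, the following is a minimal sketch rather than the model described in this paper. It assumes synthetic two-dimensional data in place of image patches, uses excess kurtosis as the measure of super-Gaussianity, and borrows the standard kurtosis-based fixed-point update from independent component analysis as one possible projection pursuit step. The helper names (`excess_kurtosis`, `whiten`, `max_kurtosis_direction`) and the choice of the absolute projection as the "response" are illustrative assumptions.

```python
import numpy as np


def excess_kurtosis(y):
    """Excess kurtosis of a 1-D sample: about 0 for Gaussian data, > 0 for super-Gaussian data."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    var = np.mean(y ** 2)
    return np.mean(y ** 4) / var ** 2 - 3.0


def whiten(X):
    """Center X (n_samples x dim) and transform it to have identity covariance."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / Xc.shape[0]
    eigval, eigvec = np.linalg.eigh(cov)
    return Xc @ eigvec @ np.diag(1.0 / np.sqrt(eigval)) @ eigvec.T


def max_kurtosis_direction(Z, n_iter=100, seed=0):
    """Hypothetical helper: search for a unit vector w whose projection Z @ w
    has extremal kurtosis, using the kurtosis-based fixed-point update
    w <- E[z (w.z)^3] - 3 w on whitened data Z."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=Z.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        y = Z @ w
        w_new = (Z * (y ** 3)[:, None]).mean(axis=0) - 3.0 * w
        w_new /= np.linalg.norm(w_new)
        converged = abs(w_new @ w) > 1.0 - 1e-10  # direction stable up to sign
        w = w_new
        if converged:
            break
    return w


# Synthetic stand-in for patch data (an assumption, not the paper's data):
# one sparse (Laplacian, super-Gaussian) source and one Gaussian source,
# both with unit variance, mixed by a rotation.
rng = np.random.default_rng(0)
n = 20000
sparse = rng.laplace(scale=1.0 / np.sqrt(2.0), size=n)
dense = rng.normal(size=n)
theta = 0.6
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
X = np.column_stack([sparse, dense]) @ A.T

Z = whiten(X)
w = max_kurtosis_direction(Z)
response = Z @ w

# Illustrative "fixation": the sample with the largest absolute response.
fixation_index = int(np.argmax(np.abs(response)))

print("kurtosis of sparse source:  ", excess_kurtosis(sparse))
print("kurtosis of Gaussian source:", excess_kurtosis(dense))
print("kurtosis of found component:", excess_kurtosis(response))
```

In this toy setting the recovered projection should align, up to sign, with the sparse source, and its excess kurtosis should be clearly positive while that of the Gaussian source stays near zero. With real images, each row of `X` would instead be a vectorized patch and `fixation_index` would map back to a patch location.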
In our framework (Figure 2), visual data is represented as an ensemble
of small image patches. Kurtosis maximization is adopted to search for
the Super Gaussian Component
978-1-4673-1228-8/12/$31.00 ©2012 IEEE