An Experimental Analysis of Saliency Detection with respect to Three Saliency Levels

Antonino Furnari, Giovanni Maria Farinella, Sebastiano Battiato
{furnari,gfarinella,battiato}@dmi.unict.it
Department of Mathematics and Computer Science - University of Catania

Abstract. Saliency detection is a useful tool for video-based, real-time Computer Vision applications. It makes it possible to select the most relevant locations of a scene and has been used in a number of related assistive technologies, such as life-logging, memory augmentation and object detection for the visually impaired, as well as to study autism and Parkinson's disease. Many works focusing on different aspects of saliency have been proposed in the literature, defining saliency in different ways depending on the task. In this paper we perform an experimental analysis focusing on three levels at which saliency is defined in different ways, namely visual attention modelling, salient object detection and salient object segmentation. We review the main evaluation datasets, specifying the level of saliency which each best describes. Through the experiments we show that the performance of saliency algorithms depends on the level with respect to which they are evaluated and on the nature of the stimuli used for the benchmark. Moreover, we show that eye fixation maps can be effectively used to perform salient object detection and segmentation, which suggests that pre-attentive bottom-up information can still be exploited to improve high-level tasks such as salient object detection. Finally, we show that benchmarking a saliency detection algorithm with respect to a single dataset/saliency level can lead to erroneous results, and conclude that many datasets/saliency levels should be considered in the evaluations.
Keywords: saliency detection, visual attention modelling, salient object detection, salient object segmentation, saliency levels, datasets for saliency evaluation

1 Introduction

During the last decades, we have observed the spread of affordable electronic devices capable of acquiring and processing images. This has virtually enabled a series of real-time Computer Vision applications which can rely on the large amount of data constantly gathered from the environment. Among these technologies, in particular, wearable devices provided with both computational power and a number of sensors (often including one or more cameras) have recently been gaining more and more popularity. Since they involve egocentric vision,