1 Social Signal Processing for Surveillance
Marco Cristani, Dong Seon Cheng

Automated surveillance of human activities has traditionally been a Computer Vision field concerned with recognizing motion patterns and producing high-level descriptions of actions and interactions among entities of interest (Cedras and Shah, 1995; Aggarwal and Cai, 1999; Gavrila, 1999; Moeslund et al., 2006; Buxton, 2003; Hu et al., 2004; Turaga et al., 2008; Dee and Velastin, 2008; Aggarwal and Ryoo, 2011; Borges et al., 2013). In the last five years, the study of human activities has been revitalized by addressing the so-called social signals (Pentland, 2007). These nonverbal cues, inspired by the social, affective, and psychological literature (Vinciarelli et al., 2009b), have allowed a more principled understanding of how humans act and react to other people and to their environment.

Social Signal Processing (SSP) is the scientific field that carries out a systematic, algorithmic, and computational analysis of social signals, drawing significant concepts from anthropology and social psychology (Vinciarelli et al., 2009b). In particular, SSP does not stop at modeling human activities, but aims at coding and decoding human behavior. In other words, it focuses on unveiling the underlying hidden states that drive a person to act in a distinct way, with particular actions. This challenge is supported by decades of investigation in the human sciences (psychology, anthropology, sociology, etc.), which have shown how humans use nonverbal behavioral cues such as facial expressions, vocalizations (laughter, fillers, back-channel, etc.), gestures, or postures to convey, often outside conscious awareness, their attitude towards other people and social environments, as well as their emotions (Richmond and McCroskey, 1995). Understanding these cues is thus paramount to understanding the social meaning of human activities.

The formal marriage of automated video surveillance with Social Sig-