A Topology-Based Concept for Contraction in Spatiotemporal Space 1)

Adrian Ion, Yll Haxhimusa, and Walter G. Kropatsch
Pattern Recognition and Image Processing Group
Institute for Computer-Aided Automation
Vienna University of Technology
{ion,yll,krw}@prip.tuwien.ac.at

Abstract: A concept is presented that relates the story-board description of video sequences to spatio-temporal hierarchies built by local contraction processes of spatio-temporal relations. Object trajectories are curves whose ends and junctions are identified. Junction points occur where two (or more) trajectories touch or cross each other, which we interpret as the “interaction” of two objects. Trajectory connections are interpreted as the high-level descriptions.

1 Introduction

Even though there is no generally accepted definition of cognitive vision yet, presumptions about the cognitive capabilities of a system can be made by comparing its results with those of an entity already ’known’ and accepted to have these capabilities: the human. Also, the Research Roadmap of Cognitive Vision [9] presents this emerging discipline as ’a point on a spectrum of theories, models, and techniques with computer vision on one end and cognitive systems at the other’. A conclusion drawn from the above is that a good starting point for a representation would: i) enable easy extraction of data for human comparison; ii) bridge the high- and low-level abstraction data used for cognitive and computer vision processes.

After ’watching’ a video of some complex action, one of the things we would expect a cognitive vision system to do is to correctly answer queries regarding the relative position of occluded objects. Consider the video 2) of a simple scenario with two black cups and a yellow ball, and describe the scene in simple English words (see the description in Table 1).
The description contains objects (hand, cup, ball), actions (grasp, release, move, etc.), and relations (to-the-left, to-the-right, etc.). While observing a dynamic scene, an important kind of information is the change of an object’s location, i.e. the change of topological information. In most of the cases, this

1) This work was supported by the Austrian Science Foundation under grants P14445-MAT, P14662-INF and FSP-S9103-N04.
2) http://www.prip.tuwien.ac.at/Research/FSPCogVis/Videos/Sequence 2 DivX.avi