Recognizing Context for Annotating a Live Life Recording

Nicky Kern, Bernt Schiele
Department of Computer Science
Darmstadt University of Technology, Germany
{nicky.kern,schiele}@informatik.tu-darmstadt.de

Albrecht Schmidt
Embedded Interaction Research Group
Universität München, Germany
albrecht.schmidt@acm.org

Abstract

In the near future it will be possible to continuously record and store the entire audio-visual lifetime of a person together with all digital information that person perceives or creates. While the storage of this data will be possible soon, retrieval and indexing into such large data sets remain an unsolved challenge. Since today's retrieval cues seem insufficient, we argue that additional cues, obtained from body-worn sensors, make associative retrieval by humans possible. We present three approaches to create such cues, each along with an experimental evaluation: the user's physical activity from acceleration sensors, his social environment from audio, and his interruptability from multiple sensors.

Keywords: Context-Awareness, Information Retrieval, Sensing Systems, Context Recognition, Wearable Computing

1 Introduction

Technologies for capture and creation are becoming ubiquitously deployed. For example, smart phones that allow image, video, and audio capture are common, and word processing and email are by now standard ways of creating text and communicating. Many people also read, scan, and perceive large amounts of data and information every day, in the office, at home, or on the move, through various means such as the web, documents, or newspapers.

Collecting information is common for personal use as well as in professional environments. For hobbyists, collecting photos and videos is a central activity. In the professional domain, the types of collected documents vary widely. In the medical profession, for example, large amounts of information are collected, ranging from written and audio notes, to X-ray and computer tomography (CT) images and movies, to sensor data from Electro-Encephalogram (EEG) and Electrocardiogram (ECG) recordings.

Advances in digital technology enable people to gather and conveniently store massive amounts of data. In particular, four different ways of accumulating information can be distinguished:

capture - acquiring images, audio, video, and sensor data.
creation - writing text, drawing images and plans.
download - obtaining information from online sources such as the WWW and storing the retrieved data locally.
communication - information created in the process of communication, resulting in email archives, chat logs, and video and audio archives.

The separation is conceptual rather than technical. It is interesting to observe that with current technologies many of these processes take place in settings where the user is mobile or in a particular environment. The context in which data is gathered can be an interesting and vital resource for further use of such data [1]. In our view, context describes the situation in the real world in which data is acquired. This may include the location, the social environment, the activity, and the physical environment, as suggested by Schmidt et al. [2]. Besides annotating audio-visual data, it becomes apparent that collected sensor data (e.g. ECG) can benefit from meta-information on activity. Automated annotation of gathered data based on sensor information is the central approach that we describe in this paper.
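To make the idea of sensor-based annotation concrete, the following is a minimal sketch, not taken from the paper itself: the names ContextAnnotation and query are hypothetical, and the cue values are illustrative. It shows how timestamped context cues of the three kinds discussed above (physical activity, social environment, interruptability) could be attached to segments of a continuous recording and then used for associative retrieval.

    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class ContextAnnotation:
        """A context cue for one time segment of the life recording.

        Fields mirror the cues proposed here: physical activity
        (from acceleration sensors), social environment (from audio),
        and interruptability (fused from multiple sensors).
        """
        start: float            # segment start, seconds since recording began
        end: float              # segment end
        activity: str           # e.g. "walking", "sitting"
        social: str             # e.g. "conversation", "alone"
        interruptability: str   # e.g. "interruptible", "do not disturb"

    def query(annotations: List[ContextAnnotation],
              activity: Optional[str] = None,
              social: Optional[str] = None) -> List[ContextAnnotation]:
        """Associative retrieval: return segments matching the given cues."""
        return [a for a in annotations
                if (activity is None or a.activity == activity)
                and (social is None or a.social == social)]

    # Hypothetical usage: retrieve all recording segments captured
    # while walking and having a conversation.
    annotations = [
        ContextAnnotation(0.0, 120.0, "sitting", "alone", "interruptible"),
        ContextAnnotation(120.0, 300.0, "walking", "conversation", "do not disturb"),
    ]
    for segment in query(annotations, activity="walking", social="conversation"):
        print(f"segment {segment.start:.0f}s-{segment.end:.0f}s")

In such a scheme the recording itself stays untouched; the context cues form a lightweight, searchable index over it, which is precisely what makes associative retrieval by humans feasible over very large data sets.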