Personal and Ubiquitous Computing manuscript No. (will be inserted by the editor)

Wearable Sensing to Annotate Meeting Recordings

Nicky Kern¹, Holger Junker², Paul Lukowicz², Bernt Schiele¹, Gerhard Tröster²

¹ Perceptual Computing and Computer Vision, Swiss Federal Institute of Technology (ETH) Zurich, Switzerland
² Wearable Computing Lab, Swiss Federal Institute of Technology (ETH) Zurich, Switzerland

The date of receipt and acceptance will be inserted by the editor

Abstract We propose to use wearable computers and sensor systems to generate personal contextual annotations in audio-visual recordings of meetings. In this paper we argue that such annotations are essential and effective to allow retrieval of relevant information from large audio-visual databases. The paper proposes several useful annotations that can be derived from cheap and unobtrusive sensors. It also describes a hardware platform designed to implement this concept, outlines approaches to extract annotations, and presents first experimental results.

Key words speaker segmentation, activity recognition, wearable sensing, meeting annotations

1 Introduction

Interestingly, about 500 terabytes of storage are sufficient to record all audio-visual information a person perceives during an entire lifespan¹. This amount of storage will be available even for an average person in the not so distant future. A wearable recording and computing device might therefore be used to 'remember' any talk, any discussion, or any environment the person saw.

Today, however, the usefulness of such data is limited by the lack of adequate methods for accessing and indexing large audio-visual databases. Whereas humans tend to remember events by associating them with personal experience and contextual information, today's archiving systems are based solely on date, time, location, and simple content classification.
As a consequence, even in a recording of a simple event sequence such as a short meeting, it is very difficult for the user to efficiently retrieve relevant events. For example, the user might remember a particular part of the discussion as being a heated exchange conducted during a short, unscheduled coffee break. However, he is unlikely to remember the exact time of this discussion, which is required to retrieve it in a typical audio-visual recording.

¹ Assuming a lifespan of 100 years, 24 h of recording per day, and 10 MB per minute of recording results in approximately 500 TB.

In this paper we propose to use wearable sensors in order to enhance the recorded data with contextual, personal information to facilitate user-friendly retrieval. For this purpose wearable computers are particularly interesting since they allow a truly personal audio-visual record of the environment of a person. A hat- or glasses-mounted camera and microphones attached to the chest or shoulders of the person enable a recording from a first-person perspective. Additionally, wearable sensors such as accelerometers and biometric sensors can enhance the recording with information on the user's activity and physical state. That sensor information can be used to annotate and structure the data stream for later associative access.

Obviously, automatically annotating and structuring the entire life-record of a person is an extremely ambitious and probably too general problem. Therefore, this paper deals with a more specific problem, namely the annotation of meetings, which, in itself, presents a very diverse setting. Most of us have many, maybe too many, meetings every week. Using a wearable to record such meetings won't make the meetings themselves more efficient. However, it may allow the user to recall whom he encountered, who discussed, who agreed or disagreed, and which arguments each participant made. It may also make it easier to reconstruct which decision was taken, why, and how.
Meetings may take place in a room instrumented with dedicated hardware. More generally, however, meetings also take place outdoors or in a mobile setting. Furthermore, important discussions may take place during the break or in the corridor. Wearable computers, which stay with the person all the time, are particularly well suited for this more general meeting scenario.
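
The lifetime storage estimate in footnote 1 can be checked with a few lines of arithmetic. The sketch below uses the three inputs stated in the footnote (100 years, 24 h of recording per day, 10 MB per minute). The 365-day year, the decimal unit conversion (1 TB = 1,000,000 MB), and the function name `lifetime_storage_tb` are assumptions introduced here for illustration only.

```python
# Back-of-the-envelope check of the lifetime storage estimate (footnote 1).
# The three inputs come from the footnote; the 365-day year and decimal
# units (1 TB = 1,000,000 MB) are assumptions made for this sketch.

LIFESPAN_YEARS = 100
DAYS_PER_YEAR = 365          # assumption: leap days ignored
MINUTES_PER_DAY = 24 * 60    # 24 h of recording per day
MB_PER_MINUTE = 10
MB_PER_TB = 1_000_000        # assumption: decimal (SI) units


def lifetime_storage_tb(years=LIFESPAN_YEARS, mb_per_minute=MB_PER_MINUTE):
    """Return the storage in TB needed to record continuously for `years`."""
    minutes = years * DAYS_PER_YEAR * MINUTES_PER_DAY
    return minutes * mb_per_minute / MB_PER_TB


if __name__ == "__main__":
    print(f"{lifetime_storage_tb():.1f} TB")
```

Under these assumptions the result is about 526 TB, which is consistent with the rounded figure of approximately 500 TB.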