ANALYSIS, RANKING AND PREDICTION IN PERVASIVE COMPUTING TRAILS

D. Papadogkonas, G. Roussos, M. Levene

School of Computer Science and Information Systems, Birkbeck College, University of London, London WC1E 7HX, U.K.
Email: {dikaios,gr,mark}@dcs.bbk.ac.uk

Keywords: pattern recognition, prediction, ranking

Abstract

Many pervasive computing applications involve the recording of user interaction with physical and digital resources in the environment. Such records can be used to establish context histories, which can subsequently be used for user behaviour analysis, pattern recognition, prediction, and the provision of context aware services. In this paper we use trails as the principal data processing primitive for analysis and prediction. We define a trail as the sequence of recorded interactions with the pervasive computing space. Trails contain patterns of space usage, and they can be used for the provision of different services, for space usage analysis, or to obtain sociological information about people using the environment simultaneously. Trail analysis requires considerable storage and computational resources to discover such patterns. Moreover, no single method exists that identifies significant trails based on different metrics across a variety of pervasive computing applications. We introduce a trail based analysis approach, an associated model for the representation of trails and trail aggregates, and suitable data structures for efficient storage, filtering and retrieval. We also propose several related algorithms and associated metrics for ranking and identifying significant trails. We use these techniques in two case studies to extract valuable information about the usage of the pervasive system environment and to evaluate the summarizability and the predictive power of our model.

1 Introduction

Pervasive computing systems often provide facilities to record interactions between different devices or physical objects and users.
In some cases, such recording is purposeful in that the aim of the research is to identify and analyze human social behaviour or to understand the utilization of the particular environment monitored. For example, Reality Mining [3] and Wireless Rope [12] attempt to understand interactions between users and fixed locations. Other projects, for instance Senseable City [17], use these interactions to analyze and describe the way we use cities. In some cases the aim is not only to analyze but also to affect user behaviour; for example, in Cityware [13] researchers try to develop tools for deploying pervasive computing systems based on the relationship between space and user behaviour, by recording Bluetooth devices at specific locations. All these projects face common problems in understanding the captured data and in analyzing these series of interactions in order to understand system usage or provide context aware services. Even though these and other projects work on the analysis of series of interactions in pervasive computing spaces, no unified approach exists that allows system usage analysis, pattern recognition and prediction for a wide range of applications under one probabilistic model.

In this paper, we address the need for such a unified approach, one which allows the analysis of different pervasive system datasets. We propose a trail based probabilistic data model and a collection of algorithms which can be used to understand the use of space in a pervasive computing environment, identify patterns and make predictions. A unique feature of our model is that it can extract patterns based on different metrics. The model can consider and weight different metrics that can be recorded from the pervasive system environment. These metrics can be the time of interaction or any other kind of sensor reading.
The proposed techniques are general: they can be employed at any level of abstraction and can incorporate whatever types of user or service interactions are deemed appropriate. We develop our approach based on the notion of a landmark, which we take to be the position of a significant entity within a landscape or any type of wireless resource a user can interact with. We capture interactions between users and landmarks by observing wireless communication between a user device and a device embedded in a landmark, although our methods do not depend on the specifics of the technology used and can cater for RFID, Wi-Fi, Bluetooth, or any other type of wireless sensor network technology. We organize series of interactions with landmarks into trails, which contain both spatial and temporal information. In particular, we record the duration of each interactive session for each user and each