HYBRID STOP DISCOVERY IN TRAJECTORY RECORDS

Le Hung TRAN
Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland
hung.tranle@epfl.ch

Tran Khanh DANG, Nam THOAI
Ho Chi Minh City University of Technology, VNUHCM, Vietnam
{khanh, nam}@cse.hcmut.edu.vn

Abstract— Advances in GPS tracking techniques have produced large amounts of trajectory data. These data can be used in many application domains, such as traffic management, urban planning, tourism, and bird migration. Recently, a semantic model that expresses a trajectory as a sequence of stops and moves was introduced and has become a hot topic in trajectory data analysis. Stops are the important parts of trajectories, such as “working at the office”, “shopping in a mall”, or “waiting for the bus”. Although several methods have been developed to discover stops, they consider the characteristics of stops separately. Because of this limitation, these approaches focus only on certain well-defined trajectories and do not work well for heterogeneous cases such as diverse and sparse trajectories. In this paper, we propose a comprehensive hybrid feature-based method to discover stops. We evaluate our approach on real-life GPS datasets and show that it provides a good abstraction of the trajectory with efficient computation.

Keywords—Location-based services; stop discovery; trajectory records; spatio-temporal data; context awareness; mobile aware applications; data mining

I. INTRODUCTION

In recent years, there has been a tremendous surge in applications and services that use location feeds. As a result, trajectories have become ubiquitous in many mobile aware applications and have grown into a huge data source. To better understand such mobility data, many data mining techniques have been applied to abstract the data and to discover interesting mobility patterns.
These techniques include clustering [12], classification [13], outlier detection [14], convoy discovery [10], and sequential rule-driven pattern mining [9], all applied to real-life GPS datasets.

A recent work on the semantic perspective of trajectories was introduced in [19]. This approach defines a conceptual model for trajectories. Its specific concern is to model trajectories with semantic annotations, allowing users to attach semantic data to specific parts of the trajectory, which are called stops. Stops are the important places that a trajectory has passed through and stayed at for a while. Consider Fig. 1, for instance: the dots show the original GPS points recorded for a trajectory, and the four circles show the important places where this trajectory stayed. With the result of such a stop discovery method, we can explain the trajectory in a more meaningful way than the initial GPS (x, y, t) trace: the tracked user started from home, went to the University for work, went shopping at COOP for a while after work, and finally returned home.

Generally, the benefits of stop discovery can be listed as follows: (1) Easily understandable: a sequence of stops provides a better abstracted view for understanding a mobility trace than the original sequence of (x, y, t) points; (2) Efficient data compression: instead of keeping all the tracking points, mobility data can be represented as a sequence of stops; and (3) Automatic stop computation: these important parts of trajectories (stops) can be computed automatically and efficiently using the relevant trajectory data discretization/segmentation methods, which are the focus of this paper.

Fig. 1. Stops in an example trajectory.

However, existing works such as [2][16][24][29] focus only on well-defined trajectories, such as the movements of vehicles and taxis, and do not work well for heterogeneous cases such as diverse and sparse trajectories.
Therefore, our research targets a robust and efficient stop discovery algorithm that works well for trajectory datasets of different kinds and quality, recorded from diverse moving objects, and that explores more challenging issues involving additional characteristics of stops. The crucial objective of our stop discovery algorithm is thus to work robustly on heterogeneous trajectory datasets.

Overall, our main contribution in this paper is twofold: (1) proposing an approach to model different features of a trajectory as feature functions; and (2) providing a framework that enables the combination of these feature functions and applies them to discover trajectory stops based on the DBSCAN principle. In our study, we use a brute-force approach, supported by appropriate statistics and analysis, to adjust the parameters of these feature functions. How to tune these parameters effectively and efficiently is outside the scope of this paper.

The rest of this paper is organized as follows. In section 2, we briefly summarize the related work. Section 3 presents the problem statement and related definitions. Next, section 4 presents our proposed hybrid approach to feature-based stop discovery. Section 5 shows experimental results and discussions. Finally, section 6 presents concluding remarks as well as future work.

2013 24th International Workshop on Database and Expert Systems Applications 1529-4188/13 $26.00 © 2013 IEEE DOI 10.1109/DEXA.2013.6
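To make the stop abstraction from the introduction concrete, the following minimal sketch turns an (x, y, t) trace into a short sequence of stops. It is not the hybrid feature-based method proposed in this paper. It is a deliberately simple single-feature baseline, assumed here only for illustration, that reports a stop whenever consecutive points stay within a radius of an anchor point for at least a minimum duration. The function name `find_stops`, the parameters `eps` and `min_duration`, and the sample trace are all hypothetical.

```python
from math import hypot


def find_stops(points, eps, min_duration):
    """Group consecutive (x, y, t) points into stops.

    points: list of (x, y, t) tuples sorted by t (hypothetical input format).
    eps: maximum distance from the first point of a candidate stop.
    min_duration: minimum time span for a candidate to count as a stop.
    Returns a list of (centroid_x, centroid_y, t_start, t_end) tuples.
    """
    stops = []
    i, n = 0, len(points)
    while i < n:
        # Extend the window while points stay within eps of anchor point i.
        j = i + 1
        while j < n and hypot(points[j][0] - points[i][0],
                              points[j][1] - points[i][1]) <= eps:
            j += 1
        t_start, t_end = points[i][2], points[j - 1][2]
        if t_end - t_start >= min_duration:
            window = points[i:j]
            cx = sum(p[0] for p in window) / len(window)
            cy = sum(p[1] for p in window) / len(window)
            stops.append((cx, cy, t_start, t_end))
            i = j  # continue after the stop
        else:
            i += 1  # the anchor belongs to a move; try the next point
    return stops


# Illustrative trace: a stay near (0, 0), a move, then a stay near (10, 10).
trace = [
    (0.0, 0.0, 0), (0.1, 0.0, 1), (0.0, 0.1, 2), (0.1, 0.1, 3),
    (3.0, 3.0, 4), (6.0, 6.0, 5),
    (10.0, 10.0, 6), (10.1, 10.0, 7), (10.0, 10.1, 8), (10.0, 10.0, 9),
]
stops = find_stops(trace, eps=0.5, min_duration=3)
print(len(trace), "points ->", len(stops), "stops")
```

On this trace the ten raw points reduce to two stops, which illustrates the compression and readability benefits listed in the introduction. A detector that relies on a single distance threshold like this one is exactly the kind of approach that can fail on sparse or heterogeneous data.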