ANALYTiC: An Active Learning System for Trajectory Classification

A. Soares Júnior*, C. Renso†, and S. Matwin‡

* Dalhousie University, Institute for Big Data Analytics, Halifax, Canada. Email: amilcar.soares@dal.ca
† ISTI-CNR, Pisa, Italy. Email: chiara.renso@isti.cnr.it
‡ Dalhousie University, Department of Computer Science, Halifax, Canada, and Institute for Computer Science, Polish Academy of Sciences, Warsaw. Email: stan@cs.dal.ca

Abstract

An increasing amount of trajectory data is becoming available from the tracking of various moving objects, such as animals, vessels, vehicles, and humans. However, these large collections of movement data lack semantic annotations, since annotation is typically done by domain experts and is time-consuming. A promising approach is to use machine learning algorithms to infer semantic annotations from the trajectories by learning from sets of labeled data. This paper experiments with active learning, a machine learning approach that minimizes the set of trajectories to be annotated while preserving good performance measures. We test some active learning strategies on three different trajectory datasets, with the objective of evaluating how this technique may limit the human effort required for the learning task. We support the annotation task by providing the ANALYTiC platform, a web-based interactive tool that visually assists the user in the active learning process over trajectory data.

Keywords: Active Learning; Semantic Annotation; Trajectory Classification

1. Introduction

We are witnessing a rapid increase in the use of positioning devices, from new-generation smartphones to GPS-enabled cameras, sensors, and indoor positioning devices. Because these devices are becoming smaller and cheaper, many kinds of objects are nowadays tracked, such as vehicles, vessels, animals, and humans. This results in huge volumes of spatio-temporal data, which require dedicated methods to analyze them properly.
In this context, there is a growing interest in the semantic enrichment of movement data: many application fields benefit from the synergistic combination of purely geometrical spatio-temporal data with semantic information, yielding what are known as semantic or annotated trajectories. Tourism, cultural heritage, traffic management, animal behavior, and vehicle tracking are just a few examples of fields benefiting from annotated trajectories [14]. However, methods that make the semantic dimension of movement data explicit are still lacking, and finding methods to automatically or semi-automatically infer trajectory labels in large datasets is an ongoing challenge [10].

Machine learning is a promising direction for annotating trajectories with semantic labels: it iteratively learns from labeled data (training sets) to build models (or classifiers), which are then applied to unlabeled data to obtain a labeled dataset with a given accuracy. Machine learning is extensively used in many prediction-based applications, where the predictions are purely numeric (e.g., velocity) or class-based (e.g., low or high velocity), and where the algorithms can learn from large labeled training sets. In the trajectory domain, when the inferred classes are semantic (e.g., kind of transportation or activity performed), the availability of training data depends mainly on trajectories manually annotated by humans. These annotations are, however, difficult to obtain, since annotating large trajectory datasets correctly is extremely time-consuming for the domain expert. The question, therefore, is: is it possible to annotate trajectories automatically by analyzing their features, thus reducing the human effort involved in manually annotating them? However, good performance generally requires a large training set, demanding substantial effort from the domain expert to provide a sufficient number of examples to the classifier. We, therefore,