Dataset for testing and training of map-matching algorithms

Matěj Kubička, Arben Cela, Philippe Moulin, Hugues Mounier, and S.I. Niculescu

Abstract: We present a large-scale dataset for testing, benchmarking, and offline learning of map-matching algorithms. For the first time, a large enough dataset is available to prove or disprove map-matching hypotheses on a world-wide scale. Several hundred map-matching algorithms have been published in the literature, each tested only on a limited scale because truly large-scale data is difficult to collect. Our contribution aims to provide a convenient gold standard for comparing map-matching algorithms against one another. Moreover, as many state-of-the-art map-matching algorithms are based on techniques that require offline learning, our dataset can readily be used as the training set. Because of the global coverage of our dataset, learning does not have to be biased toward the part of the world where the algorithm was tested.

I. INTRODUCTION

Many map-matching algorithms have been published so far, but there is no standard methodology to estimate their performance. Many authors test their contributions only in simulation. Those who perform field testing most often run tests that are limited in size, usually without comparison to other algorithms. The reasons for this are twofold: first, it is not immediately clear how to estimate the performance of such an algorithm (as mentioned in an early paper by White et al. [4]), and second, until recently it was cost-prohibitive to collect a large enough dataset for robust testing.

A. Problem statement

We are provided with a track and a map and we wish to obtain a route. A track is a finite, ordered set of geopoints, where each geopoint has an assigned position on Earth and a timestamp. A map is modeled as a directed graph consisting of two sets: nodes and arcs. Each node has an assigned position on Earth, and each arc represents a linear road segment between two nodes.
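The track and map definitions above can be sketched as plain data structures. The following is only a minimal illustration, not the format used by the dataset: the class names (`GeoPoint`, `Node`, `Arc`, `RoadMap`), the helper `make_track`, and the choice of latitude/longitude in degrees with timestamps in seconds are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class GeoPoint:
    """One track sample: a position on Earth plus a timestamp (assumed units)."""
    lat: float        # latitude in degrees (assumption)
    lon: float        # longitude in degrees (assumption)
    timestamp: float  # seconds since an arbitrary epoch (assumption)


@dataclass(frozen=True)
class Node:
    """A map node with an assigned position on Earth."""
    node_id: int
    lat: float
    lon: float


@dataclass(frozen=True)
class Arc:
    """A directed, linear road segment from node `tail` to node `head`."""
    tail: int
    head: int


@dataclass
class RoadMap:
    """A directed graph made of a set of nodes and a set of arcs."""
    nodes: Dict[int, Node] = field(default_factory=dict)
    arcs: List[Arc] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_arc(self, tail: int, head: int) -> Arc:
        # Both endpoints must already exist, since an arc joins two nodes.
        if tail not in self.nodes or head not in self.nodes:
            raise KeyError("both endpoints must be nodes of the map")
        arc = Arc(tail, head)
        self.arcs.append(arc)
        return arc

    def outgoing(self, node_id: int) -> List[Arc]:
        """Arcs that can be entered from the given node, in insertion order."""
        return [a for a in self.arcs if a.tail == node_id]


def make_track(points: Iterable[GeoPoint]) -> List[GeoPoint]:
    """Return the geopoints as a track, ordered by timestamp."""
    return sorted(points, key=lambda p: p.timestamp)


# Usage: three nodes, a two-way segment 1<->2 and a segment 2->3.
road_map = RoadMap()
for i, (lat, lon) in enumerate([(48.70, 2.16), (48.71, 2.17), (48.72, 2.18)], start=1):
    road_map.add_node(Node(i, lat, lon))
road_map.add_arc(1, 2)
road_map.add_arc(2, 1)
road_map.add_arc(2, 3)
track = make_track([GeoPoint(48.71, 2.17, 10.0), GeoPoint(48.70, 2.16, 0.0)])
```

Because the graph is directed, the segment between nodes 2 and 3 in this example can only be traversed from 2 to 3, while the segment between nodes 1 and 2 is stored as two opposite arcs.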
Note that our definition of a map is the simplest representation of a road network. In particular, such a graph embeds the curvature of streets as well as one-way restrictions in its topology. Some authors use a slightly different model in which each arc is a curved street with its shape encoded as attributes. A route is a contiguous sequence of arcs in a map along which a vehicle travels. The map-matching problem deals with matching a track to a map in order to obtain a route. Specifically, we match a track to a sequence of contiguous arcs on a map. As each arc represents a part of a street, our goal is to recover the sequence of streets on which the vehicle travels. This task is essential for a variety of problems such as routing, location-aware services, and floating car data systems. The incongruence between the track and the route is often referred to as "spatial mismatch". Map-matching is then a method to correct this mismatch.

Fig. 1. Examples of problematic situations: (a) problematic track feature, (b) positioning system error, (c) map error, (d) parking lot.

Matěj Kubička, Hugues Mounier and S.I. Niculescu are with Laboratoire des Signaux et Systemes, CNRS/Supelec, 91192, Gif-sur-Yvette Cedex, France, matej.kubicka@lss.supelec.fr, hugues.mounier@lss.supelec.fr, silviu.niculescu@lss.supelec.fr.
Arben Cela is with UPE, ESIEE Paris, 93162, Noisy-Le-Grand Cedex, France, arben.cela@esiee.fr
Philippe Moulin is with IFP Energies nouvelles, 1 & 4, avenue de Bois-Préau, 92852 Rueil-Malmaison Cedex, France, philippe.moulin@ifpen.fr

B. Motivation

The main motivation for this work stems from our inability to properly investigate the properties of our own algorithm [3]. The approach we took was based on a small set of test runs in rural and urban areas with a predetermined route. We ran our algorithm on tracks collected during these test runs and observed that the resulting route was matched correctly.
2015 IEEE Intelligent Vehicles Symposium (IV), June 28 - July 1, 2015, COEX, Seoul, Korea. 978-1-4673-7266-4/15/$31.00 ©2015 IEEE

It