Electronics
Article

Multi-Camera Vehicle Tracking Based on Deep Tracklet Similarity Network

Yun-Lun Li, Hao-Ting Li and Chen-Kuo Chiang *

Advanced Institute of Manufacturing with High-Tech Innovations, Center for Innovative Research on Aging Society (CIRAS) and Department of Computer Science and Information Engineering, National Chung Cheng University, Minhsiung, Chiayi 621301, Taiwan; xu3mp6xjp6@gmail.com (Y.-L.L.); remidream@gmail.com (H.-T.L.)
* Correspondence: ckchiang@cs.ccu.edu.tw; Tel.: +886-5-272-9111

Citation: Li, Y.-L.; Li, H.-T.; Chiang, C.-K. Multi-Camera Vehicle Tracking Based on Deep Tracklet Similarity Network. Electronics 2022, 11, 1008. https://doi.org/10.3390/electronics11071008
Academic Editors: John Ball, Ning Wang
Received: 7 December 2021
Accepted: 22 March 2022
Published: 24 March 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Abstract: Multi-camera vehicle tracking at the city scale has received considerable attention in the last few years. The task is challenging because of large variations in scale, frequent occlusion, and appearance differences caused by differing viewing angles. In this research, we propose the Tracklet Similarity Network (TSN) for a multi-target multi-camera (MTMC) vehicle tracking system based on evaluating the similarity between vehicle tracklets. In addition, a novel component, the Candidates Intersection Ratio (CIR), is proposed to refine the similarity. It provides an association scheme that builds the multi-camera tracking results as a tree structure. Based on these components, an end-to-end vehicle tracking system is proposed.
The experimental results demonstrate an 11% improvement in the evaluation score compared to the conventional similarity baseline.

Keywords: vehicle tracking; multiple camera; tracklet similarity; deep learning

1. Introduction

With the recent advancement of computer vision, city-scale automatic traffic management is now possible. Real-time multi-target multi-camera (MTMC) vehicle tracking can be improved by techniques for automatic traffic monitoring and management [1–6]. Automatic video analytics can enhance traffic infrastructure design and congestion handling through the pervasively deployed traffic cameras. Real-time MTMC tracking is one of the crucial tasks in traffic management. Its purpose is to achieve better traffic design and traffic flow optimization by tracking many vehicles in a network across multiple surveillance cameras, as shown in Figure 1.

Most MTMC approaches follow the tracking-by-detection pipeline. First, a detector is adopted to obtain all vehicle detections. Next, a single-camera tracker links the detections of the same vehicle in each view into vehicle tracklets. Finally, these vehicle tracklets are associated across cameras.

The MTMC task also poses several difficulties. Eliminating unreliable vehicle tracklets and dealing with view variations are significant problems in this task. Large-scale automatic video analytic systems must handle a large variability of vehicle types and appearances to meet real-world accuracy and reliability requirements. For applications such as vehicle re-identification, large view variations pose a significant challenge across views. Similarly, how best to perform space–time vehicle tracklet association across views is important for vehicle counting and traffic analysis. In addition, images are captured by different cameras.
The vehicle may have different poses and illumination conditions, so its apparent color can differ between views. Different weather conditions, such as rain or haze, make vehicle tracking problems more challenging.

Existing works [7,8] evaluate the connectivity between tracklets across cameras by simple Euclidean distance and cosine similarity. However, these metrics are not robust enough to measure the connectivity between tracklets. Moreover, when one tracklet is associated