Relaxed Pairwise Learned Metric for Person Re-Identification

Martin Hirzer, Peter M. Roth, Martin Köstinger, and Horst Bischof

Institute for Computer Graphics and Vision
Graz University of Technology

Abstract. Matching persons across non-overlapping cameras is a rather challenging task. Thus, successful methods often build on complex feature representations or sophisticated learners. A recent trend to tackle this problem is to use metric learning to find a suitable space for matching samples from different cameras. However, most of these approaches ignore the transition from one camera to the other. In this paper, we propose to learn a metric from pairs of samples from different cameras. In this way, even less sophisticated features describing color and texture information are sufficient for finally getting state-of-the-art classification results. Moreover, once the metric has been learned, only linear projections are necessary at search time, where a simple nearest neighbor classification is performed. The approach is demonstrated on three publicly available datasets of different complexity, where it can be seen that state-of-the-art results can be obtained at much lower computational costs.

1 Introduction

Person re-identification, i.e., recognizing an individual across spatially disjoint cameras, is becoming one of the major challenges in visual surveillance. Typical applications include but are not limited to tracking criminals, analyzing crowd movements in public places, and finding children who lost their parents. Since the number of public areas that become subject to video surveillance is ever growing, efficient, automatic systems are required to reduce the load on human operators.

In general, person re-identification is very challenging for several reasons. First, the appearance of an individual can vary extremely across a network of cameras due to changing view points, illumination, different poses, etc.
Second, there is a potentially high number of “similar” persons (e.g., people wear rather dark clothes in winter). Third, in contrast to similar large-scale search problems, typically no accurate temporal and spatial constraints can be exploited to ease the task. Thus, motivated by the high number of practical applications and still unresolved difficulties, there has been an increased scientific interest (e.g., [1–10]) in recent years, and also various benchmark datasets [11, 8, 6] have been published.
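
As a rough illustration of the search-time procedure summarized in the abstract (a linear projection followed by nearest neighbor matching), consider the following minimal Python sketch. It assumes a Mahalanobis-type metric factored as M = L^T L, so that the learned distance between two feature vectors equals the squared Euclidean distance between their projections. The matrix `L`, the toy feature vectors, and the helper names `project`, `squared_distance`, and `rank_gallery` are hypothetical choices for illustration and are not taken from the paper; how `L` is learned is not shown here.

```python
def project(L, x):
    """Apply the linear map L (given as a list of rows) to feature vector x."""
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]


def squared_distance(a, b):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((a_i - b_i) ** 2 for a_i, b_i in zip(a, b))


def rank_gallery(L, probe, gallery):
    """Return gallery indices sorted by distance to the probe after projection."""
    p = project(L, probe)
    dists = [squared_distance(p, project(L, g)) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: dists[i])


# Hypothetical 2x3 projection (assumed to be already learned) and toy features.
L = [[1.0, 0.0, 0.0],
     [0.0, 0.5, 0.0]]
probe = [1.0, 2.0, 9.0]                # sample from camera A
gallery = [[4.0, 2.0, 9.0],            # samples from camera B
           [1.0, 3.0, -5.0],
           [1.0, 2.0, 0.0]]

ranking = rank_gallery(L, probe, gallery)
print(ranking)  # gallery indices, best match first
```

In this toy setting the projection discards the third feature dimension entirely, so the gallery sample that differs from the probe only in that dimension is ranked first. In a real system the gallery projections could be computed once and reused for every query, which is one plausible reading of why the search step is cheap.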