What Is Optimized in Tight Convex Relaxations for Multi-Label Problems?

Christopher Zach, Microsoft Research Cambridge, UK, chzach@microsoft.com
Christian Häne, ETH Zürich, Switzerland, chaene@inf.ethz.ch
Marc Pollefeys, ETH Zürich, Switzerland, marc.pollefeys@inf.ethz.ch

Abstract

In this work we present a unified view on Markov random fields and recently proposed continuous tight convex relaxations for multi-label assignment in the image plane. These relaxations are far less biased towards the grid geometry than Markov random fields. It turns out that the continuous methods are non-linear extensions of the local polytope MRF relaxation. In view of this result, a better understanding of these tight convex relaxations in the discrete setting is obtained. Further, a wider range of optimization methods is now applicable to find a minimizer of the tight formulation. We propose two methods to improve the efficiency of minimization. One uses a weaker, but more efficient, continuously inspired approach as initialization and gradually refines the energy where necessary. The other reformulates the dual energy, enabling smooth approximations to be used for efficient optimization. We demonstrate the utility of our proposed minimization schemes in numerical experiments.

1. Introduction

Assigning labels to image regions, e.g. in order to obtain a semantic segmentation, is one of the major tasks in computer vision. The most prominent approach to solve this problem is to formulate label assignment as a Markov random field (MRF) incorporating local label preference and neighborhood smoothness. Since label assignment is in general NP-hard, finding the true solution is intractable and approximate solutions are determined instead. One promising approach to solve MRF instances is to relax the intrinsically difficult constraints to convex outer bounds.
There are currently two somewhat distinct lines of research utilizing such convex relaxations. The first, mostly used in the machine learning community, is based on a graph representation of image grids and uses variations of dual block-coordinate methods [10, 9, 16, 15] (usually referred to as message passing algorithms in the literature). The other set of methods is derived from the analysis of partitioning an image in the continuous setting, i.e. variations of the Mumford-Shah segmentation model [12, 1]. Using the principle of biconjugation to obtain tight convex envelopes, [5] obtains a convex relaxation of multi-label problems with generic (but metric) transition costs in the continuous setting. Subsequent discretization of this model to finite grids yields strong results in practice, but it was not fully understood what is optimized in the discrete setting.

In this work we close the gap between convex formulations for MRFs and continuous approaches by identifying the latter methods as non-linear (but still convex) extensions of the standard LP relaxation of Markov random fields. This insight has several implications: (a) it becomes clearer why the model proposed in [5] is tighter than other relaxations proposed for similar labeling problems, and (b) a wider range of optimization methods becomes applicable, especially after obtaining equivalent convex programs utilizing redundant constraints. Thus, the results obtained in this work are of theoretical and practical interest.

2. Background

In the following section we summarize the necessary background on discrete and continuous relaxations of multi-label problems.

2.1. Notations

In this section we introduce some notation used in the following. For a convex set $C$ we use $\imath_C$ to denote the corresponding indicator function, i.e. $\imath_C(x) = 0$ for $x \in C$ and $\infty$ otherwise. We use the short-hand notations $[x]_+$ and $[x]_-$ for $\max\{0, x\}$ and $\min\{0, x\}$, respectively.
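As a small illustration of the notation just introduced, the following Python sketch evaluates an indicator function and the positive and negative parts of a scalar. The helper names (`indicator`, `pos_part`, `neg_part`) and the choice of the unit interval as the example convex set are assumptions made purely for illustration; they do not appear in the paper.

```python
import math


def indicator(x, contains):
    """Indicator function of a set C: 0 if x lies in C, +infinity otherwise.

    `contains` is a hypothetical membership test for C (an assumption of
    this sketch, not notation from the paper).
    """
    return 0.0 if contains(x) else math.inf


def pos_part(x):
    """[x]_+ = max{0, x}."""
    return max(0.0, x)


def neg_part(x):
    """[x]_- = min{0, x}."""
    return min(0.0, x)


def in_unit_interval(x):
    """Example convex set C = [0, 1] (chosen only for illustration)."""
    return 0.0 <= x <= 1.0


if __name__ == "__main__":
    for x in (-3.0, 0.5, 2.0):
        print(x, indicator(x, in_unit_interval), pos_part(x), neg_part(x))
```

Note that the two parts always recombine to the original value, i.e. $[x]_+ + [x]_- = x$, which is why they are convenient for splitting a term by sign.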
Finally, for an extended real-valued function $f : \mathbb{R}^n \to \mathbb{R} \cup \{\infty\}$ we denote its convex conjugate by $f^*(y) = \max_x \left( x^T y - f(x) \right)$.

2.2. Label Assignment, the Marginal Polytope and its LP Relaxation

In the following we will consider only labeling problems with unary and pairwise interactions between nodes. Let $\mathcal{V}$ be a set of $V = |\mathcal{V}|$ nodes and $\mathcal{E}$ be a set of edges connecting nodes from $\mathcal{V}$. The goal of inference is to assign labels Λ: 1