Structure Extraction from Texture via Relative Total Variation

Li Xu    Qiong Yan    Yang Xia    Jiaya Jia∗
Department of Computer Science and Engineering
The Chinese University of Hong Kong

∗ e-mail: {xuli, qyan, yxia, leojia}@cse.cuhk.edu.hk

Figure 1: Meaningful structure extraction from textured surfaces. Examples from left to right are graffiti on brick, marble mosaic (ca. 260 AD), crop circles, and graffiti on gate.

Abstract

Meaningful structures are ubiquitously formed by, or appear over, textured surfaces. Extracting them in the presence of texture patterns, which can be regular, near-regular, or irregular, is very challenging but of great practical importance. We propose new inherent variation and relative total variation measures, which capture the essential difference between these two types of visual forms, structure and texture, and develop an efficient optimization system to extract main structures. The new variation measures are validated on millions of sample patches. Our approach enables a number of new applications to manipulate, render, and reuse the immense number of “structure with texture” images and drawings that have traditionally been difficult to edit properly.

CR Categories: I.4.3 [Image Processing and Computer Vision]: Enhancement—Smoothing; G.1.6 [Numerical Analysis]: Optimization—Nonlinear programming

Keywords: texture, structure, smoothing, total variation, relative total variation, inherent variation, prior, regularized optimization

1 Introduction

Many natural scenes and human-created art pieces contain texture. For instance, graffiti and drawings are commonly seen on brick walls, railroad boxcars, and subways; carpets, sweaters, and other fine crafts contain various geometric patterns. In human history, mosaic has long been an art form used to represent detailed scenes of people and animals, and to imitate paintings, using stone, glass, ceramic, and other materials.
When searching in Google Images, millions of such pictures and drawings can be found quickly. A few examples from different sources are shown in Figure 1. They share the property that semantically meaningful structures are blended with or formed by texture elements. We call them “structure+texture” images. It is particularly interesting that the human visual system is fully capable of understanding these pictures without needing to remove textures. In psychology [Arnheim 1956], it is also found that “the overall structural features are the primary data of human perception, not the individual details”.

In contrast to this almost effortless process, extracting structures with a computer is much more challenging. Tedious manual manipulation is needed in all the photo editing software that we used. A few approaches [Meyer 2001; Yin et al. 2005; Aujol et al. 2006] employ a total variation image regularizer in optimization. This framework, however, cannot satisfactorily distinguish texture from the main structures because both of them could receive similar penalties during optimization. Recent edge-preserving image editing tools [Farbman et al. 2008; Subr et al. 2009; Farbman et al. 2010; Kass and Solomon 2010; Paris et al. 2011; Xu et al. 2011] do not aim to solve the same problem and are, therefore, not optimal solutions. More analysis and comparisons will be provided.

We present a simple yet effective method based on novel local variation measures to accomplish texture removal. We found that with regard to our new relative total variation, which will be elaborated later in this paper, texture and main structure exhibit completely different properties, making them surprisingly decomposable. With this finding, we present an optimization framework in which meaningful content and textural edges are penalized differently.
A robust numerical solver is also proposed to decompose the original highly non-convex optimization problem into several linear systems, for which fast and robust solutions exist. Note that we do not assume specific regularity or symmetry of the texture patterns, and instead allow for a high level of randomness. Non-uniform and anisotropic texture can thus be handled in a unified framework.
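
The earlier observation that a total variation regularizer can assign similar penalties to texture and to main structure can be illustrated with a minimal sketch. It assumes a discrete anisotropic total variation (the sum of absolute forward differences in x and y), which is a common textbook form and not necessarily the exact formulation used in the cited works. The function name `total_variation` and the two toy 8×8 images are hypothetical and chosen only for illustration.

```python
def total_variation(img):
    """Anisotropic total variation of a 2D grid (list of rows).

    Assumed discretization: sum of absolute forward differences
    along x and along y. This is an illustrative choice.
    """
    rows, cols = len(img), len(img[0])
    tv = 0
    for y in range(rows):
        for x in range(cols):
            if x + 1 < cols:  # horizontal forward difference
                tv += abs(img[y][x + 1] - img[y][x])
            if y + 1 < rows:  # vertical forward difference
                tv += abs(img[y + 1][x] - img[y][x])
    return tv


SIZE = 8

# Toy "structure": one strong vertical step edge of height 7.
step_edge = [[0 if x < SIZE // 2 else 7 for x in range(SIZE)]
             for _ in range(SIZE)]

# Toy "texture": faint vertical stripes of amplitude 1.
stripes = [[x % 2 for x in range(SIZE)] for _ in range(SIZE)]

tv_edge = total_variation(step_edge)
tv_stripes = total_variation(stripes)
print(tv_edge, tv_stripes)  # prints "56 56"
```

In this toy setting the single strong edge and the faint repetitive stripes receive the same total variation, so a penalty based on this quantity alone has no basis for keeping one while removing the other.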