A Near Optimal Coder For Image Geometry With Adaptive Partitioning

Arian Maleki, Department of Electrical Engineering, Stanford University, arianm@stanford.edu
Morteza Shahram, Department of Statistics, Stanford University, mshahram@stanford.edu
Gunnar Carlsson, Department of Mathematics, Stanford University, gunnar@math.stanford.edu

Abstract: In this paper, we present a new framework to compress the geometry of images. This framework generalizes the standard quad partitioning approaches to compression of image geometry (e.g., wedgelets) in two ways. First, we employ an adaptive rectangular partitioning rather than quad partitioning. Second, our coder uses an overcomplete collection of (stripe-like) atoms which contains wedgelets as a special case. We present an information-theoretic analysis based on Kolmogorov's ε-entropy to show that this collection provides a near-optimal representation of a class of cartoon images with piecewise polynomial boundaries. Furthermore, we develop a provably near-optimal greedy algorithm that significantly reduces the complexity of the exhaustive search method required to achieve the entropy bound. Rate-distortion simulation results show a 1.5-2 dB improvement over standard wedgelets for the "Cameraman" image.

Index Terms: Image coding, entropy, piecewise polynomial, quadtrees, image representations.

I. INTRODUCTION

Natural images are composed of two components: cartoon (geometry) and texture. Cartoons are conveniently described by geometric structures such as flat objects with piecewise smooth boundaries. It is known that wavelets do not provide a sparse representation of these structures [1]. Several works have proposed frameworks to achieve more efficient representations [2], [3], [4], [5] of geometrical features such as lines and curves. Among these frameworks, wedgelets in particular have proven to be a successful basis for practical compression schemes [6], [7].
The basic idea of wedgelets is to partition an image into dyadic squares and approximate each square with a wedge-like patch. However, wedgelets emerged from optimal representations of a very simple class of images (bi-color patches with a smooth boundary) [2], [8]. The work by Shukla et al. [9] implies that the wedgelet system loses coding efficiency (due to overpartitioning) in regions with slightly more complicated geometries, which are frequent in natural images.

Furthermore, there have been several works on learning a dictionary in which image patches can be represented very sparsely [10], [11]. In these approaches, it is assumed that there exists a dictionary of 2-D functions such that any image patch can be approximated as a sparse linear combination of the elements of the dictionary. The elements of such a dictionary are learned by an iterative optimization scheme. The majority of the learned elements resemble stripe-like patches rather than wedgelets.

In this paper, we introduce a new class of 2-D functions to deal with more general geometries than those of wedgelets. We study the performance of an optimal encoder for this class of functions by computing Kolmogorov's ε-entropy. Kolmogorov's ε-entropy [12] encodes a compact set of functions such that the distortion of every signal is less than ε, and counts the number of codewords required. Shannon's rate-distortion theory [13] also provides a framework to study compression. It codes stochastic sources by discarding the functions with small probabilities and compressing the rest. Since we do not assume any probabilistic model for our class of functions, we use Kolmogorov's ε-entropy.

In order to reach the ε-entropy, we expand the dictionary of wedgelets with stripe-like patches (which we call bi-wedgelets) and also generalize the partitioning algorithm. Using exhaustive search for this algorithm is computationally infeasible.
We propose a greedy partitioning and greedy atom selection methodology to significantly speed up the encoder.

The organization of this paper is as follows. In the next section, we introduce a class of functions which will be our model for image geometries throughout this paper. We also compute the ε-entropy of this class. In Section III, we propose a practical near-optimal coder. In Section IV, we present rate-distortion results for different algorithms. Finally, we conclude the paper in Section V.

II. NOTATIONS AND FRAMEWORK

Let the space of piecewise polynomial functions on an interval $I_x \subset [0, 1]$ be represented by $BP^Q_N(I_x, A)$, where $N$ is the maximum degree of the polynomials on the interval, $Q$ is the maximum number of singularities¹, and the functions are bounded by $A$, i.e., $\sup_{t \in I_x} |f(t)| \le A$. Based on the class $BP^Q_N(I_x, A)$, we construct a class of 2-D functions, $T^Q_N(I_x)$, whose elements are defined by

$$f(x, y) = \alpha_d \, \mathbb{1}_{\{y \le h_d(x)\}} + \alpha_m \, \mathbb{1}_{\{h_d(x) < y \le h_u(x)\}} + \alpha_u \, \mathbb{1}_{\{y > h_u(x)\}}, \tag{1}$$

where $h_u(x), h_d(x) \in BP^Q_N(I_x, 1)$, $0 \le h_d(x) \le h_u(x)$, $\mathbb{1}_A$ is the indicator function of the set $A$, and $0 \le \alpha_d, \alpha_m, \alpha_u \le 1$.

¹ A singularity is defined as a point at which the function is not infinitely differentiable.
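To make Eq. (1) concrete, the following is a minimal sketch that samples one element of $T^Q_N(I_x)$ on a pixel grid, assuming $I_x = [0, 1]$. The helper names `piecewise_poly` and `render`, the grid size, and all boundary coefficients and gray levels are illustrative assumptions and are not taken from the paper. The second call shows that choosing $h_d = h_u$ empties the middle region, which is one way to obtain a two-region, wedgelet-like patch from the same formula.

```python
import numpy as np


def piecewise_poly(breakpoints, coeffs):
    """Build h(x), polynomial on each piece of [0, 1] (hypothetical helper).

    breakpoints: sorted interior break locations (the singularities).
    coeffs: one coefficient list per piece, highest degree first,
            i.e. len(coeffs) == len(breakpoints) + 1.
    """
    breakpoints = np.asarray(breakpoints, dtype=float)

    def h(x):
        x = np.asarray(x, dtype=float)
        # Index of the piece that contains each sample of x.
        piece = np.searchsorted(breakpoints, x, side="right")
        out = np.empty_like(x)
        for k, c in enumerate(coeffs):
            mask = piece == k
            out[mask] = np.polyval(c, x[mask])
        return out

    return h


def render(h_d, h_u, a_d, a_m, a_u, n=64):
    """Sample f(x, y) of Eq. (1) at pixel centers of an n-by-n grid."""
    t = (np.arange(n) + 0.5) / n      # pixel centers in (0, 1)
    lo = h_d(t)[None, :]              # lower boundary, one value per column x
    hi = h_u(t)[None, :]              # upper boundary, one value per column x
    y = t[:, None]                    # row index corresponds to y
    # Three regions: below h_d, between h_d and h_u, above h_u.
    return np.where(y <= lo, a_d, np.where(y <= hi, a_m, a_u))


# Lower boundary: two linear pieces with one singularity at x = 0.5.
h_d = piecewise_poly([0.5], [[0.4, 0.2], [-0.2, 0.5]])
# Upper boundary: a constant (a polynomial of degree 0, no singularities).
h_u = piecewise_poly([], [[0.8]])

stripe = render(h_d, h_u, a_d=0.1, a_m=0.9, a_u=0.4)  # three gray levels
wedge = render(h_d, h_d, a_d=0.1, a_m=0.9, a_u=0.4)   # middle region is empty
```

Here row 0 of the returned array corresponds to the smallest $y$, so the array must be flipped vertically if it is displayed with the image origin at the top left.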