Geometric & Topological Representations of Maximum Classes with Applications to Sample Compression

Benjamin I. P. Rubinstein¹ and J. Hyam Rubinstein²
¹ Computer Science Division, University of California, Berkeley, U.S.A.
² Department of Mathematics & Statistics, the University of Melbourne, Australia
¹ benr@cs.berkeley.edu, ² rubin@ms.unimelb.edu.au

Abstract

We systematically investigate finite maximum classes, which play an important role in machine learning as concept classes meeting Sauer's Lemma with equality. Simple arrangements of hyperplanes in Hyperbolic space are shown to represent maximum classes, generalizing the corresponding Euclidean result. We show that sweeping a generic hyperplane across such arrangements forms an unlabeled compression scheme of size VC dimension and corresponds to a special case of peeling the one-inclusion graph, resolving a conjecture of Kuzmin & Warmuth. A bijection between maximum classes and certain arrangements of Piecewise-Linear (PL) hyperplanes in either a ball or Euclidean space is established. Finally, we show that d-maximum classes corresponding to PL hyperplane arrangements in R^d have cubical complexes homeomorphic to a d-ball, or equivalently complexes that are manifolds with boundary.

1 Introduction

Maximum concept classes have the largest cardinality possible for their given VC dimension. Such classes are of particular interest as their special recursive structure underlies all general sample compression schemes known to-date [Flo89, War03, KW07]. It is this structure that admits many elegant geometric and algebraic topological representations upon which this paper focuses.
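As a small illustration of what "meeting Sauer's Lemma with equality" means, the following sketch checks by brute force whether a finite class in {0, 1}^n is maximum, that is, whether its cardinality equals the Sauer bound sum_{i=0}^{d} C(n, i) for its VC dimension d. The function names and the example class (all concepts with at most d ones) are assumptions chosen for illustration and are not taken from the paper; the search is exponential in n and only suitable for toy inputs.

```python
from itertools import combinations
from math import comb


def vc_dimension(concepts, n):
    """Brute-force VC dimension of a non-empty finite class of 0/1 tuples of length n."""
    concepts = set(concepts)
    d = 0
    for k in range(1, n + 1):
        # A k-subset of coordinates is shattered if the class projects onto all 2^k patterns.
        shattered = any(
            len({tuple(c[i] for i in idx) for c in concepts}) == 2 ** k
            for idx in combinations(range(n), k)
        )
        if not shattered:
            break  # subsets of shattered sets are shattered, so no larger set can be
        d = k
    return d


def sauer_bound(n, d):
    """Largest possible cardinality of a class of VC dimension d on n points."""
    return sum(comb(n, i) for i in range(d + 1))


def is_maximum(concepts, n):
    """True if the class meets Sauer's Lemma with equality."""
    concepts = set(concepts)
    return len(concepts) == sauer_bound(n, vc_dimension(concepts, n))


def at_most_d_ones(n, d):
    """Hypothetical example class: every concept labeling at most d points positive."""
    return {
        tuple(1 if i in ones else 0 for i in range(n))
        for k in range(d + 1)
        for ones in combinations(range(n), k)
    }
```

For instance, `is_maximum(at_most_d_ones(4, 2), 4)` holds because the class has 1 + 4 + 6 = 11 concepts and VC dimension 2, whereas the two-concept class `{(0, 0), (1, 1)}` has VC dimension 1 but falls short of the bound of 3.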
Littlestone & Warmuth [LW86] introduced the study of sample compression schemes, defined as a pair of mappings for given concept class C: a compression function mapping a C-labeled n-sample to a subsequence of labeled examples and a reconstruction function mapping the subsequence to a concept consistent with the entire n-sample. A compression scheme of bounded size (the maximum cardinality of the subsequence image) was shown to imply learnability [LW86]. The converse, that classes of VC dimension d admit compression schemes of size d, has become one of the oldest unsolved problems actively pursued within learning theory. Recently Kuzmin and Warmuth achieved compression of maximum classes without the use of labels [KW07]. They also conjectured that their elegant Min-Peeling Algorithm constitutes such an unlabeled d-compression scheme for d-maximum classes.

As in our previous work [RBR08], maximum classes can be fruitfully viewed as cubical complexes. These are also topological spaces, with each cube equipped with a natural topology of open sets from its standard embedding into Euclidean space. We proved that d-maximum classes correspond to d-contractible complexes (topological spaces with an identity map homotopic to a constant map), extending the result that 1-maximum classes have trees for one-inclusion graphs. Peeling can be viewed as a special form of contractibility for maximum classes. However, there are many non-maximum contractible cubical complexes that cannot be peeled, which demonstrates that peelability reflects more detailed structure of maximum classes than given by contractibility alone.

In this paper we approach peeling from the direction of simple hyperplane arrangement representations of maximum classes.
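To make the definition of a compression scheme from [LW86] concrete, here is a minimal sketch of an unlabeled scheme of size 1 for a toy class that is not treated in the paper: thresholds on the real line, c_t(x) = 1 iff x >= t, with t = +infinity allowed so that the all-negative concept belongs to the class. The class, the function names, and the representation of samples as lists of (x, label) pairs are all assumptions made for illustration.

```python
import math


def compress(sample):
    """Map a threshold-labeled sample to an unlabeled subsequence of size at most 1.

    Keeps only the smallest positively labeled point, or nothing if there is none.
    """
    positives = [x for x, label in sample if label == 1]
    return [min(positives)] if positives else []


def reconstruct(kept):
    """Map the kept subsequence back to a threshold concept c_t(x) = [x >= t]."""
    t = kept[0] if kept else math.inf  # an empty subsequence encodes "all negative"
    return lambda x: int(x >= t)


def is_consistent(sample):
    """Check that the reconstructed concept agrees with every example in the sample."""
    h = reconstruct(compress(sample))
    return all(h(x) == label for x, label in sample)
```

The scheme is consistent on any sample labeled by some threshold t: every negative point lies below t and hence below the kept point, and every positive point lies at or above the kept point. No labels need to be stored because a kept point is always positive, which is the sense in which the scheme is unlabeled.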
Kuzmin & Warmuth predicted that d-maximum classes corresponding to simple linear hyperplane arrangements could be unlabeled d-compressed by sweeping a generic hyperplane across the arrangement, and that concepts are min-peeled as their corresponding cell is swept away [KW07, Conjecture 1]. We positively resolve the first part of the conjecture and show that sweeping such arrangements corresponds to a new form of corner-peeling, which we prove is distinct from min-peeling. While min-peeling removes minimum degree concepts from a one-inclusion graph, corner-peeling peels vertices that are contained in unique cubes of maximum dimension.

We explore simple hyperplane arrangements in Hyperbolic geometry, which we show correspond to a set of maximum classes, properly containing those represented by simple linear Euclidean arrangements. These classes can again be corner-peeled by sweeping. Citing the proof of existence of maximum unlabeled compression schemes presented in [BDL98], Kuzmin & Warmuth ask whether unlabeled compression schemes for infinite classes such as positive half spaces can be constructed explicitly [KW07]. We present constructions for illustrative but simpler classes, suggesting that there are many interesting infinite maximum classes admitting explicit compression schemes, and under appropriate conditions, sweeping infinite Euclidean and Hyperbolic arrangements corresponds to compression by corner-peeling.

Next we prove that all maximum classes in {0, 1}^n are represented as simple arrangements of Piecewise-Linear (PL)