Classification by Reflective Convex Hulls
Mineichi Kudo Atsuyoshi Nakamura
Division of Computer Science
Graduate School of Information Sci. and Tech.
Hokkaido University, Sapporo 060-0814, JAPAN
E-mail: {mine,atsu}@main.ist.hokudai.ac.jp
Ichigaku Takigawa
Institute for Chemical Research
Bioinformatics Center
Kyoto University
E-mail: takigawa@kuicr.kyoto-u.ac.jp
Abstract
A set of convex bodies, each including samples of a single
class only, is used for classification. Each convex body is
defined by facets (hyper-planes) that separate the
class from the other classes. This paper describes an
algorithm that finds a set of such convex bodies efficiently
and examines the performance of a classifier using them.
The relationship to support vector machines is also
discussed.
1. Introduction
The convex hull conv(S) of a finite set S in m-dimensional
Euclidean space is one of the central concepts
in computational geometry. In pattern recognition, the
convex hulls that cover all the training samples of
one class allow us to measure the separability among
classes. Indeed, the relationship between those convex
hulls and support vector machines (SVMs) has been
well studied [2, 5, 8]. A typical view in such studies is
that the hyper-plane of an SVM is identical to the bisector
hyper-plane between the closest points of the convex
hulls of two classes [8].
When we use convex hulls for classification, the following
problems arise: 1) the convex hull of a finite set
is hard to construct in high dimensions, 2) it is costly
to calculate the distance between a point and the
convex hull, and 3) in general, we need more than one
convex hull to approximate a class region. For 1),
there is no efficient algorithm to find the convex hull
explicitly in high dimensions. Indeed, the number of
facets is often exponential in m. For 2), the problem of
calculating the distance D(x, conv(S)) for x ∉ conv(S)
is known to be NP-hard in the representation size of
conv(S) [7]. For 3), we need more than one convex
hull to exclude samples from the classes other than a
target class. The authors have already proposed such an
approach using quasi convex hulls with restricted angles [6].
In this paper, to cope with these three problems, we
use several randomized techniques. We gain efficiency
at the expense of losing exactness to some
extent.
2. Convex Hulls and Support Functions
The simplest definition of the convex hull conv(S)
of a given dataset S is the intersection of all convex
sets containing S. For a finite set S, C = conv(S) is
a polyhedron with at most |S| vertices. Such a polyhedron
can be defined in several ways. By ∂C, we denote
the boundary of C and divide it into q-faces according
to their dimensions. For example, 0-faces are
the vertices of C and (m - 1)-faces are the facets
or hyper-planes. Let V(C) be the set of vertices of
C and F(C) be the set of facets of C. The second
definition is called the V-representation and is defined as
C = {y = ∑_x c_x x | ∑_x c_x = 1, c_x ≥ 0, x ∈ V(C)}.
The third one is called the H-representation and is defined
as C = {y | 〈w, y〉 ≤ c, ∀(w, c) ∈ F(C)}, where 〈·, ·〉
is the inner product and a facet (w, c) is specified by a
normal vector w (||w|| = 1) and a constant c ∈ R.
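As a minimal sketch of how the H-representation can be used, the following Python snippet tests whether a point satisfies every facet inequality 〈w, y〉 ≤ c. The names `inner`, `in_h_representation`, and `unit_square`, as well as the tolerance `tol`, are assumptions introduced here for illustration and do not appear in the paper.

```python
from typing import Sequence, Tuple

# A facet is stored as (w, c): a unit normal vector w and a constant c.
Facet = Tuple[Sequence[float], float]


def inner(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product <u, v> of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def in_h_representation(y: Sequence[float],
                        facets: Sequence[Facet],
                        tol: float = 1e-12) -> bool:
    """Return True if <w, y> <= c holds for every facet (w, c).

    tol is a hypothetical slack for floating-point round-off.
    """
    return all(inner(w, y) <= c + tol for w, c in facets)


# Illustrative example: the unit square [0, 1]^2 written as four facets.
unit_square = [
    ((1.0, 0.0), 1.0),   # x1 <= 1
    ((-1.0, 0.0), 0.0),  # -x1 <= 0, i.e. x1 >= 0
    ((0.0, 1.0), 1.0),   # x2 <= 1
    ((0.0, -1.0), 0.0),  # -x2 <= 0, i.e. x2 >= 0
]

inside = in_h_representation((0.5, 0.5), unit_square)
outside = in_h_representation((1.5, 0.5), unit_square)
```

The test costs one inner product per facet, so it is cheap only when the number of facets is small.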
In this paper, as the fourth definition, we use support
functions to express a convex hull C. The support function
of S for a unit vector w (||w|| = 1) is given by
H(S, w) = sup{〈x, w〉 | x ∈ S},
where sup denotes the supremum. Using all possible directions
w, we can specify C as
C = ⋂_{w: ||w|| = 1} {x | 〈x, w〉 ≤ H(S, w)}.
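The support-function definition can be illustrated with a short Python sketch. For a finite set S the supremum is attained, so H(S, w) is a maximum over the samples. The sketch checks the half-space conditions only for a finite list of directions, which is an assumption made here for illustration: the resulting region contains conv(S) but may be larger. The function names and the choice of evenly spaced 2-D directions are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def inner(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product <u, v> of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def support(S: Sequence[Sequence[float]], w: Sequence[float]) -> float:
    """H(S, w) = sup{<x, w> | x in S}; for finite S this is a max."""
    return max(inner(x, w) for x in S)


def unit_directions_2d(k: int) -> List[Tuple[float, float]]:
    """k evenly spaced unit vectors in the plane (illustrative choice)."""
    return [(math.cos(2.0 * math.pi * i / k),
             math.sin(2.0 * math.pi * i / k)) for i in range(k)]


def in_outer_approximation(x: Sequence[float],
                           S: Sequence[Sequence[float]],
                           directions: Sequence[Sequence[float]],
                           tol: float = 1e-12) -> bool:
    """Check <x, w> <= H(S, w) for each given direction w.

    With finitely many directions this tests membership in a superset
    of conv(S), not in conv(S) itself.
    """
    return all(inner(x, w) <= support(S, w) + tol for w in directions)


# Illustrative data: the four corners of the unit square.
S = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
dirs = unit_directions_2d(8)

h_east = support(S, (1.0, 0.0))
center_ok = in_outer_approximation((0.5, 0.5), S, dirs)
far_ok = in_outer_approximation((2.0, 2.0), S, dirs)
```

Each direction costs one pass over S, so the hull is never constructed explicitly.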
Of course, it is sufficient to use only the normal vectors w of
the facets (w, c) ∈ F(C) instead of all possible w's. To emphasize
this role, we call a plane h(S, w) = {x | 〈x, w〉 = H(S, w)} a support
978-1-4244-2175-6/08/$25.00 ©2008 IEEE