A hierarchical classifier with Growing Neural Gas clustering

Igor T. Podolak, Kamil Bartocha
Institute of Computer Science, Faculty of Mathematics and Computer Science, Jagiellonian University
Łojasiewicza 6, Kraków, Poland
uipodola@theta.uoks.uj.edu.pl

Abstract. A novel architecture for a hierarchical classifier (HC) is defined. The objective is to combine several weak classifiers to form a strong one, but a different approach from known methods, e.g. AdaBoost, is taken: the training set is split on the basis of the previous classifier's misclassifications between output classes. The problem is split into overlapping sub-problems, each classifying into a different set of output classes. This allows for a reduction in task size, as each sub-problem is smaller in the sense of having fewer output classes, and for higher accuracy. The HC thus takes a different approach from boosting. The groups of output classes overlap, so examples from a single class may end up in several sub-problems. It is shown that this approach ensures that such a hierarchical classifier achieves better accuracy. A notion of generalized accuracy is introduced. Sub-problem generation is simple, as it is performed with a clustering algorithm operating on classifier outputs. We propose to use the Growing Neural Gas [1] algorithm because of its good adaptiveness.

1 Introduction

A classifier is a model which assigns an example attribute vector to one of predefined classes [2,3]. In machine learning, several methods are known for training model architectures, among them hierarchical ones. Training a single architecture, e.g. a single neural network, to a set accuracy can take a long time. On the other hand, it is possible to combine several weak models to obtain a hierarchical model with a good training and generalization rate, i.e. correct classification of examples not used in training; boosting methods such as AdaBoost [4,5,6] are examples of this approach.
The proposed approach of the Hierarchical Classifier (HC) defines a methodology for building a hierarchical classifier automatically. In short, a simple weak classifier is first built for the whole problem; then sub-problems are formed by grouping together examples from classes that were frequently mistaken for one another. This step is done by means of clustering in the output space, not the input space. Each of the clusters, which may overlap, forms a new sub-problem for which a new weak classifier is built, and the process is repeated recursively until a set accuracy is reached. The approach gives the HC model first introduced in [7,8,9].
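The sub-problem generation step above can be sketched in code. The following is a minimal illustration, not the paper's implementation: a nearest-centroid model stands in for the weak classifier, and a plain k-means loop stands in for Growing Neural Gas (which, unlike k-means, adapts the number of prototypes during training). All function names are hypothetical; the `overlap` tolerance producing overlapping class groups is an assumed mechanism for illustration.

```python
import numpy as np

def weak_classifier_outputs(X, y, n_classes):
    """Nearest-centroid 'weak' classifier: returns soft class-membership
    vectors (normalized inverse distances) for every training example."""
    centroids = np.array([X[y == c].mean(axis=0) for c in range(n_classes)])
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    w = 1.0 / (d + 1e-9)
    return w / w.sum(axis=1, keepdims=True)          # rows sum to 1

def class_response_matrix(outputs, y, n_classes):
    """Average output vector per true class: frequently confused classes
    get similar rows, so clustering happens in the output space."""
    return np.array([outputs[y == c].mean(axis=0) for c in range(n_classes)])

def group_classes(R, k, overlap=0.2, seed=0):
    """Cluster class-response rows of R into k groups; each group is one
    sub-problem. k-means stands in here for Growing Neural Gas."""
    rng = np.random.default_rng(seed)
    centers = R[rng.choice(len(R), size=k, replace=False)].copy()
    for _ in range(50):                               # plain Lloyd iterations
        d = np.linalg.norm(R[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = R[labels == j].mean(axis=0)
    # Overlapping membership: a class joins every group whose center lies
    # within (1 + overlap) of its nearest center, so groups may share classes.
    d = np.linalg.norm(R[:, None, :] - centers[None, :, :], axis=2)
    nearest = d.min(axis=1)
    groups = [set(np.where(d[:, j] <= (1 + overlap) * nearest)[0])
              for j in range(k)]
    return [g for g in groups if g]
```

Each returned group would then define a smaller classification task (fewer output classes) for which the next weak classifier is trained, recursing until the target accuracy is reached.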