A new notion of weakness in classification theory

Igor T. Podolak, Adam Roman

Institute of Computer Science, Faculty of Mathematics and Computer Science, Jagiellonian University, Łojasiewicza 6, Kraków, Poland
uipodola@theta.uoks.uj.edu.pl, roman@ii.uj.edu.pl

Summary. The notion of a weak classifier, as one which is "a little better" than a random one, was first introduced for 2-class problems [1]. Extensions to K-class problems are known; all are based on relative activations for the correct and incorrect classes and do not take into account the final choice of the answer. A new understanding and definition is proposed here, which takes into account only the final classification choice that must be made. It is shown that for a K-class classifier to be called "weak", it needs to achieve a risk value lower than 1/K. This approach considers only the probability of the final answer choice, not the actual activations.

1 Introduction

The classification task for a given problem may be solved using various machine learning approaches. One of the most effective is to train several simple classifiers and combine their responses [2, 3, 4, 5]. The training of subsequent classifiers may depend on how well the already trained classifiers perform. This is the background of the boosting approach, introduced by Schapire and further developed in [6, 7, 8]. Boosting algorithms train a sequence h_1, h_2, ... of simple classifiers, each performing slightly better than a random one. After training h_i, a boosting algorithm, e.g. AdaBoost, changes the probability distribution with which examples are chosen for further training, putting more attention to incorrectly recognized examples. The final classifier is a weighted sum of all individual classifiers h_i. The algorithm is well defined for a 2-class problem, with an extension to K-class (K > 2) problems through a so-called pseudo-loss function [7].
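The reweighting step described above can be sketched as follows. This is a minimal illustration of the standard 2-class AdaBoost update (labels in {-1, +1}), not the authors' method; the function name and the toy inputs are our own.

```python
import numpy as np

def adaboost_reweight(y_true, y_pred, w):
    """One AdaBoost round: compute the classifier weight alpha and the
    updated example distribution that emphasizes misclassified examples.
    y_true, y_pred: arrays of labels in {-1, +1}; w: current distribution."""
    # weighted error of the current weak classifier h_i
    eps = np.sum(w * (y_pred != y_true))
    # classifier weight: larger when the weak learner beats random guessing
    alpha = 0.5 * np.log((1.0 - eps) / eps)
    # misclassified examples (y_true * y_pred = -1) get their weight raised,
    # correctly classified ones (product = +1) get it lowered
    w_new = w * np.exp(-alpha * y_true * y_pred)
    return alpha, w_new / w_new.sum()  # renormalize to a distribution

# toy round: one of four equally weighted examples is misclassified
y_true = np.array([1, 1, -1, -1])
y_pred = np.array([1, -1, -1, -1])
alpha, w = adaboost_reweight(y_true, y_pred, np.full(4, 0.25))
# the misclassified example now carries half of the total weight
```

With a weighted error of 0.25, the single misclassified example ends up with weight 1/2 and each correct one with 1/6, so the next weak classifier concentrates on the hard example; the final classifier is then sign(sum_i alpha_i h_i(x)).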
Using the pseudo-loss approach, a classifier is defined to be weak if the expected activation for the true class is higher than the mean activation for the incorrect classes. The authors have worked on a so-called Hierarchical Classifier (HC) [9]. The HC is built of several simple classifiers which divide the input space