Weighted Strategy for Error-Correcting Output Codes

Sergio Escalera, Oriol Pujol, and Petia Radeva

Computer Vision Center, Universitat Autònoma de Barcelona, Campus UAB, Edifici O, 08193, Bellaterra, Spain.
Dept. Matemàtica Aplicada i Anàlisi, Universitat de Barcelona, Gran Via 585, 08007, Barcelona, Spain.
{sergio,oriol,petia}@maia.ub.es

Abstract

The Error-Correcting Output Codes (ECOC) technique is a general framework capable of extending any binary classification process to the multi-class case. In this work, we present a novel decoding strategy that takes advantage of the ECOC coding to outperform the existing decoding strategies. The results show that the presented methodology considerably increases the performance of state-of-the-art ECOC designs.

Keywords: Ensemble Methods and Boosting, Learning, Classification.

1 Introduction

Multi-class categorization in Machine Learning consists of assigning labels to instances that belong to a finite set of N object classes (N > 2). Nevertheless, designing a multi-class classification technique is a difficult task. For this reason, it is common to conceive algorithms that distinguish between two classes and to combine them following a special criterion. The pairwise (one-versus-one) voting scheme [6] and the one-versus-all grouping strategy [8] are the most frequently used procedures. Error-Correcting Output Codes were born as a framework for handling multi-class problems using binary classifiers [3]. ECOC has been shown to dramatically improve the classification accuracy of supervised learning algorithms in the multi-class case by reducing the variance of the learning algorithm and correcting errors caused by the bias of the learners [4]. Furthermore, ECOC has been successfully applied to a wide range of applications, such as face recognition, text recognition, and handwritten digit classification.
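As an illustration only, the one-versus-all and one-versus-one decompositions mentioned above can be enumerated as lists of binary problems. The following is a minimal sketch, not part of the presented method; the function names and the class labels are hypothetical and chosen purely for the example.

```python
from itertools import combinations


def one_versus_all(classes):
    """One binary problem per class: that class against all the others."""
    return [([c], [o for o in classes if o != c]) for c in classes]


def one_versus_one(classes):
    """One binary problem per unordered pair of classes."""
    return [([a], [b]) for a, b in combinations(classes, 2)]


# Hypothetical labels for a problem with N = 4 classes.
classes = ["c1", "c2", "c3", "c4"]

ova = one_versus_all(classes)
ovo = one_versus_one(classes)

print(len(ova))  # N binary problems: 4
print(len(ovo))  # N * (N - 1) / 2 binary problems: 6

# Each entry is (positive classes, negative classes) for one binary classifier.
for positives, negatives in ova:
    print(positives, "vs", negatives)
```

Under these assumptions, one-versus-all trains N binary classifiers that each see every class, whereas one-versus-one trains N(N - 1)/2 classifiers that each see only two classes.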
The ECOC framework consists of two steps: a coding step, where a codeword¹ is assigned to each class, and a decoding step, where, given a test sample, the method looks for the most similar class codeword. One of the first binary coding strategies designed is the one-versus-all approach, where each class is discriminated against the rest. However, it was not until Allwein et al. [1] introduced a third symbol (the zero symbol) in the coding process that the coding step received special attention. The ternary coding gives more expressivity to the ECOC framework by allowing some classes to be ignored by the binary classifiers. Thanks to this, strategies such as one-versus-one [6] and random sparse coding [1] are possible. However, these predefined codes are independent of the problem domain, and recently, new approaches involving heuristics for the design of problem-dependent output codes have been proposed [10][5] with successful results.

¹ The codeword is a sequence of bits (called a code) representing each class, where each bit identifies the class membership according to a given binary classifier.

The decoding step was originally based on error-correcting principles under the assumption that the learning task can be modelled as a communication problem, in which class information is transmitted over a channel [3]. In this sense, the Hamming and the Euclidean distances were the first attempts at decoding [3]. Still very few alternative de-