Binary labelling and decision-level fusion

Terry Windeatt *, Reza Ghaderi

Centre for Vision, Speech and Signal Processing, School of Electronics Engineering, Information Technology and Mathematics, University of Surrey, Guildford, Surrey GU2 5XH, UK

Received 6 June 2000; received in revised form 17 January 2001; accepted 18 January 2001

Abstract

Two binary labelling techniques for decision-level fusion are considered for reducing correlation in the context of multiple classifier systems. First, we describe a method based on error-correcting coding that uses binary code words to decompose a multi-class problem into a set of complementary two-class problems. We look at the conditions necessary for reduction of error and introduce a modified version that is less sensitive to code word selection. Second, we describe a partitioning method for two-class problems that transforms each training pattern into a vertex of the binary hypercube. A constructive algorithm for binary-to-binary mappings identifies a set of inconsistently classified patterns, random subsets of which are used to perturb base classifier training sets. Experimental results on artificial and real data, using a combination of simple neural network classifiers, demonstrate improvement in performance for these techniques, the first suitable for k-class problems, k > 2, and the second for k = 2. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Decision-level fusion; Multiple classifiers; Partitioning; Error-correcting; Bagging; Boosting

1. Introduction

Various fusion strategies and associated architectures have been reported for the identification stage of data fusion systems. In [1], fusion strategies are characterised as data-level, feature-level and decision-level, and it is customary to divide decision-level strategies into soft-level and hard-level.
By hard-level, we mean that the combining mechanism operates on single-hypothesis decisions, in contrast with soft-level, which implies a measure of confidence associated with the decision. Designing optimal strategies for decision-level fusion has been of interest to researchers in the fields of pattern recognition, machine learning, neural networks and, more recently, data mining, knowledge discovery and data fusion. If the strategy is based on learning from a set of training patterns whose category label is known, it is referred to as a supervised learning problem. The classification task is to assign a test pattern, not previously used in training, to one of several possible classes. What makes the classification problem challenging is that learning predictive relationships from data is in general ill-posed [2], which means that certain mathematical properties associated with the mapping (uniqueness, continuity, existence) may be violated. For a practical application, this implies that a learning machine must be designed to appropriately fit the training data; otherwise it will not perform well on a separate set of test data. For classification problems, there are many methods for building assumptions into the machine to enable it to generalise well. One approach, which has become popular across many disciplines, is based upon the combination of multiple classifiers, also referred to as an ensemble, committee or expert fusion. Various combining frameworks have been proposed, but most systems use similar processing units as individual (base) classifiers. However, if the results of all the base classifiers are too well correlated there is little gain in combining, as discussed in [3]. It is therefore desirable to reduce correlation between classifiers in order to obtain optimal performance.
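To make the hard-level/soft-level distinction concrete, the following sketch contrasts the two combining mechanisms: majority voting over single-hypothesis class labels (hard-level) versus averaging per-class confidence scores (soft-level). This is an illustrative example only; the function names and array conventions are ours, not from the paper.

```python
import numpy as np

def hard_fusion_majority_vote(decisions):
    """Hard-level fusion: each base classifier contributes a single class
    label per pattern, and the combiner takes the most frequent label.

    decisions: shape (n_classifiers, n_patterns), integer class labels.
    Returns one fused label per pattern.
    """
    decisions = np.asarray(decisions)
    n_classes = decisions.max() + 1
    # Count votes per class for each pattern (column of `decisions`).
    votes = np.apply_along_axis(
        lambda col: np.bincount(col, minlength=n_classes), 0, decisions)
    return votes.argmax(axis=0)

def soft_fusion_mean(confidences):
    """Soft-level fusion: each base classifier contributes a confidence
    score per class; scores are averaged before the final decision.

    confidences: shape (n_classifiers, n_patterns, n_classes).
    Returns one fused label per pattern.
    """
    return np.asarray(confidences).mean(axis=0).argmax(axis=1)
```

Note that the two schemes can disagree: a soft combiner can let one confident classifier outvote two weakly confident ones, whereas the hard combiner weights every decision equally.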
In practice, it may not be feasible to obtain additional training samples, in which case an alternative is to use existing patterns but arrange it so that each classifier sees a slightly different problem. Some design methods aim to reduce correlation by, for example, using different base classifier parameters, feature sets or training sets [4]. A popular approach is to perturb the training set as in Bagging [5] and Boosting [6]. These methods appear to work well for unstable classifiers, such as neural networks.

Information Fusion 2 (2001) 103–112 www.elsevier.nl/locate/inffus

* Corresponding author. Tel.: +44-1483-259286; fax: +44-1483-259554. E-mail address: t.windeatt@surrey.ac.uk (T. Windeatt).
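The training-set perturbation used by Bagging [5] can be sketched as follows: each base classifier is trained on a bootstrap replicate, i.e., n patterns drawn with replacement from the original n training patterns, so the replicates differ and the base classifier outputs are decorrelated. The function name and interface below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def bagging_train_sets(X, y, n_classifiers, rng=None):
    """Yield one bootstrap replicate of (X, y) per base classifier.

    Each replicate has the same size as the original training set but is
    drawn with replacement, so on average about 63% of the original
    patterns appear in any one replicate, some of them more than once.
    """
    rng = np.random.default_rng(rng)
    n = len(X)
    for _ in range(n_classifiers):
        idx = rng.integers(0, n, size=n)  # sample indices with replacement
        yield X[idx], y[idx]
```

Each replicate would then be passed to a separate (unstable) base classifier, such as a small neural network, before combining the resulting decisions.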