Improving Classifier Fusion Using Particle Swarm Optimization

Kalyan Veeramachaneni, Dept. of EECS, Syracuse University, Syracuse, NY, U.S.A., kveerama@syr.edu
Weizhong Yan, GE Global Research Center, Niskayuna, NY, U.S.A., yan@crd.ge.com
Kai Goebel, NASA Ames Research Center, Moffett Field, CA, U.S.A., goebel@email.arc.nasa.com
Lisa Osadciw, Dept. of EECS, Syracuse University, Syracuse, NY, U.S.A., laosadci@syr.edu

Abstract - Both experimental and theoretical studies have shown that classifier fusion can be effective in improving overall classification performance. Classifier fusion can be performed at either the score level (raw classifier outputs) or the decision level. While tremendous research interest has focused on score-level fusion, work on decision-level fusion is sparse. This paper presents a particle swarm optimization based decision-level fusion scheme for optimizing classifier fusion performance. Multiple classifiers are fused at the decision level, and the particle swarm optimization algorithm finds the optimal decision threshold for each classifier together with the optimal fusion rule. Specifically, we present an optimal fusion strategy for fusing multiple classifiers to satisfy accuracy requirements, as applied to a real-world classification problem. The optimal decision fusion technique is found to perform significantly better than conventional classifier fusion methods, i.e., traditional decision-level fusion and the averaged sum rule.

Keywords: Decision-level fusion, multiple classifier fusion, particle swarm optimization.

1 Introduction

Classifier design is the task of developing a classification system that optimizes performance with respect to requirements. Traditionally, classification systems are designed by empirically choosing a single classifier through experimental evaluation of a number of different ones. The parameters of the selected classifier are then optimized so that the specified performance is met.
Single-classifier systems have limited performance. For certain real-world classification problems, this single-classifier design approach may fail to meet the desired performance even after all parameters and architectures of the classifier have been fully optimized. In these cases, classifier fusion, one of the most significant advances in pattern classification in recent years, proves to be effective and efficient [2]. By taking advantage of the complementary information provided by the constituent classifiers, classifier fusion offers improved performance, i.e., the fused system is more accurate than the best individual classifier.

Classifier fusion can be done at two different levels, namely, the score level and the decision level. In score-level fusion, the raw outputs (scores or confidence levels) of the individual classifiers are combined in a certain way to reach a global decision. The combination can be performed either simply, using the sum rule or the averaged sum rule, or in a more sophisticated manner, using another classifier. Decision-level fusion, on the other hand, arrives at the final classification decision by combining the decisions of the individual classifiers. The majority voting rule and the Chair-Varshney [13] optimal fusion rule are two examples of decision-level fusion schemes. The Chair-Varshney [13] optimal decision fusion rule is constructed from the individual classifiers' performance indices. The optimal fusion rule can turn out to be the majority voting rule but is not limited to it.

There have been very few studies on optimizing fusion system performance. At each level of fusion, alternate fusion strategies exist which can be explored to achieve optimal performance across different costs of misclassification. In this paper, decision-level fusion is chosen, and optimization of decision-level fusion to achieve the required performance is presented. In decision-level fusion, shown in Figure 1, each classifier under the binary hypothesis gives its decision regarding the class of the observation.
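The two decision-level schemes named above, majority voting and the Chair-Varshney rule, can be sketched as follows. This is an illustrative sketch, not the paper's implementation; the function names, the 0/1 decision encoding, and the default uniform prior p1 = 0.5 are my assumptions. The Chair-Varshney rule sums the log-likelihood ratios of the local decisions, weighting each classifier by its detection probability Pd and false-alarm probability Pf.

```python
import math

def majority_vote(decisions):
    """Majority voting: declare H1 if more than half of the
    local decisions (each 0 or 1) vote for H1."""
    return int(sum(decisions) > len(decisions) / 2)

def chair_varshney(decisions, pd, pf, p1=0.5):
    """Chair-Varshney fusion: sum of per-classifier decision
    log-likelihood ratios plus the prior term; declare H1 if positive.
    pd[i], pf[i] are classifier i's detection and false-alarm
    probabilities; p1 is the (assumed) prior P(H1)."""
    llr = math.log(p1 / (1.0 - p1))
    for u, d, f in zip(decisions, pd, pf):
        # P(u=1|H1)/P(u=1|H0) = pd/pf ;  P(u=0|H1)/P(u=0|H0) = (1-pd)/(1-pf)
        llr += math.log(d / f) if u == 1 else math.log((1.0 - d) / (1.0 - f))
    return int(llr > 0)
```

Note how the rules can disagree: a single highly reliable classifier voting H1 can outweigh two weak classifiers voting H0 under Chair-Varshney, whereas majority voting treats all votes equally.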
The decisions from the multiple classifiers are fused at the fusion processor. The fusion processor applies a fusion rule to the multiple decisions and produces a global decision. The most important problem in achieving optimum performance at the decision level is the optimal setting of the individual decision thresholds. There are 2^(2^N) possible fusion rules for a binary hypothesis and an N-classifier system. Most past work on classifier fusion neglects the full set of rules that can be explored at the decision level. Moreover, the decision threshold for each individual classifier is typically set to minimize the error of that classifier [2], before fusion is carried out. This entails selecting an operating point from the Receiver Operating Characteristic (ROC) curve of the individual classifier that minimizes its error for the given costs of misclassification. Once the decision thresholds for the individual classifiers are set, the majority voting rule or the Chair-Varshney optimal fusion rule is used as the fusion rule. This method, however, does not guarantee optimum performance after fusion. Performance can be defined under the Neyman-Pearson criterion or the Bayesian criterion. In this paper, the optimal thresholds and the corresponding fusion rule which result in optimum

Proceedings of the 2007 IEEE Symposium on Computational Intelligence in Multicriteria Decision Making (MCDM 2007)
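The size of the fusion-rule space described above can be made concrete. A fusion rule assigns a global decision (0 or 1) to each of the 2^N possible local-decision vectors, giving 2^(2^N) rules. The sketch below, an illustrative brute-force search under my own assumptions (conditionally independent classifiers, equal priors, Bayesian error as the criterion, and hypothetical function names), enumerates all rules for small N and picks the one minimizing the Bayesian error. It is not the paper's method: the paper's PSO replaces this exhaustive search and additionally tunes the per-classifier thresholds, which move each classifier's (Pf, Pd) pair along its ROC curve.

```python
from itertools import product

def all_fusion_rules(n):
    """Enumerate every fusion rule for n binary classifiers: a rule maps
    each of the 2**n local-decision vectors to a global decision (0/1),
    giving 2**(2**n) rules in total."""
    vectors = list(product([0, 1], repeat=n))
    for outputs in product([0, 1], repeat=len(vectors)):
        yield dict(zip(vectors, outputs))

def rule_performance(rule, pf, pd):
    """Global false-alarm and detection probabilities of a fusion rule,
    assuming conditionally independent classifiers with per-classifier
    false-alarm (pf) and detection (pd) probabilities."""
    gpf = gpd = 0.0
    for u in product([0, 1], repeat=len(pf)):
        if rule[u] == 1:
            prob_h0 = prob_h1 = 1.0
            for ui, f, d in zip(u, pf, pd):
                prob_h0 *= f if ui else 1.0 - f
                prob_h1 *= d if ui else 1.0 - d
            gpf += prob_h0  # P(global decision = 1 | H0)
            gpd += prob_h1  # P(global decision = 1 | H1)
    return gpf, gpd

def best_rule(pf, pd, p1=0.5):
    """Brute-force search over all 2**(2**n) rules for the one minimizing
    the Bayesian error (1-p1)*Pf_global + p1*(1-Pd_global).  Feasible
    only for small n."""
    best, best_err = None, float("inf")
    for rule in all_fusion_rules(len(pf)):
        gpf, gpd = rule_performance(rule, pf, pd)
        err = (1.0 - p1) * gpf + p1 * (1.0 - gpd)
        if err < best_err:
            best, best_err = rule, err
    return best, best_err
```

For N = 2 there are 2^(2^2) = 16 rules; already at N = 5 the count is 2^32, which is why a stochastic search such as PSO is attractive over exhaustive enumeration, especially once the continuous decision thresholds are searched jointly with the rule.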