BENEFIT MAXIMIZATION IN CLASSIFICATION ON FEATURE PROJECTIONS

H. Altay Güvenir
Bilkent University, Computer Engineering Department, Ankara, Turkey

Abstract

In some domains, the cost of a wrong classification may differ for each pair of predicted and actual classes. The benefit of a correct prediction may also differ from class to class. In this paper, a new classification algorithm, called BCFP (for Benefit Maximizing Classifier on Feature Projections), is presented. The BCFP classifier learns a set of classification rules that predict the class of a new instance with maximum benefit or minimum cost. BCFP represents a concept in the form of feature projections on each feature dimension separately. Classification in the BCFP algorithm is based on voting among the individual predictions made on each feature. A genetic algorithm is used to select the relevant features. The performance of the BCFP algorithm is evaluated in terms of accuracy. As a case study, the BCFP algorithm is applied to the problem of diagnosing gastric carcinoma. A lesion can be an indicator of one of nine different levels of gastric carcinoma. The benefit of correctly classifying the early levels is much greater than that of the late levels. The cost of a wrong classification also differs for each pair of classes.

Key Words

Machine learning, feature projection, voting, benefit maximization

1. Introduction

Classical classification algorithms aim to maximize the number of correct classifications or, equivalently, to minimize the number of incorrect classifications. However, in some domains, the cost of a wrong classification is different for each predicted/actual class pair. The benefit of a correct prediction is also different for each class. In this paper we propose an inductive classification learning algorithm, called Benefit Maximizing Classifier on Feature Projections (BCFP).

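To make the difference between accuracy maximization and benefit maximization concrete, the following minimal sketch applies the generic expected-benefit decision rule to a single instance. It is an illustration only and not the BCFP algorithm itself: the function name max_benefit_class, the three-class benefit matrix, and the class probabilities are all hypothetical values chosen for the example.

```python
from typing import Sequence


def max_benefit_class(
    probs: Sequence[float],
    benefit: Sequence[Sequence[float]],
) -> int:
    """Return the index of the prediction with the highest expected benefit.

    probs[a]      : estimated probability that the actual class is a
    benefit[p][a] : benefit of predicting p when the actual class is a
                    (a negative entry is the cost of a wrong prediction)
    """
    expected = [sum(b * q for b, q in zip(row, probs)) for row in benefit]
    return max(range(len(expected)), key=expected.__getitem__)


# Hypothetical benefit matrix (assumption, not taken from the paper).
# Class 0 plays the role of an early level: a correct prediction of it
# carries the largest benefit, and each wrong pair has its own cost.
benefit = [
    [10.0, -2.0, -8.0],  # predict class 0
    [-1.0, 5.0, -4.0],   # predict class 1
    [-3.0, -1.0, 2.0],   # predict class 2
]

# Hypothetical class probabilities for one instance.
probs = [0.40, 0.45, 0.15]

most_probable = max(range(len(probs)), key=probs.__getitem__)
best_benefit = max_benefit_class(probs, benefit)
print(most_probable, best_benefit)
```

With these assumed numbers, the most probable class is class 1, but predicting class 0 has the higher expected benefit, so the two criteria disagree on the same instance.
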
BCFP is based on a knowledge representation technique, called feature projections, which has been successfully employed in CFP [1]. As a case study, we show its application to a medical dataset for the diagnosis of gastric tumors.

The input to the BCFP training algorithm is a set of training instances. Learning from the training examples, BCFP constructs a representation of the classification knowledge inherent in these examples. This knowledge is represented as the projections of the training dataset, in the form of feature intervals, on each feature dimension separately. For each feature dimension, projection points with similar characteristics are grouped into intervals. An interval is therefore a generalization that represents a set of feature values that yield the same classification. Classification in the BCFP algorithm is based on a voting mechanism among the individual predictions made on each feature. Since each feature participates independently of the others, both in learning and in classification, BCFP handles missing feature values easily and naturally by simply ignoring them.

Other machine learning algorithms that use a feature projection based knowledge representation have been successfully applied to medical domains. For example, an expert system named DES was implemented for the differential diagnosis of erythemato-squamous diseases in dermatology [2], based on the VFI (Voting Feature Intervals) technique [3]. These classification systems, however, are not designed for cost-sensitive classification. They are therefore not suited to domains in which the benefit of a correct classification differs from class to class and the cost of a wrong classification differs for each pair of predicted and actual classes.

The next section presents the BCFP algorithm. Section 3 describes the gastric carcinoma domain and presents the results of applying the BCFP algorithm to it.
The performance of the BCFP algorithm is also compared with that of medical students specializing in gastroenterology. Finally, the last section concludes with some remarks and suggestions for future work.

2. The BCFP algorithm

The BCFP algorithm is the cost-sensitive member of the family of feature projection based classification algorithms [1]. In the following subsections, the