Active Learning with Ensembles for Image Classification H. Liu and A. Mandvikar 1 and P. Foschi 2 and K. Torkkola 3 1 Department of Computer Science & Engineering, PO Box 875406, Arizona State University, Tempe, AZ 85287. {huanliu,amitm}@asu.edu 2 Romberg Tiburon Center, 3152 Paradise Drive, Tiburon, CA 94920. tfoschi@sfsu.edu 3 Human Interface Lab, Motorola Labs,Tempe, AZ 85284. kari.torkkola@motorola.com Abstract In many real-world tasks of image classification, limited amounts of labeled data are available to train automatic classifiers. Consequently, extensive human expert involvement is required for verifica- tion. A novel solution is presented that makes use of active learning combined with an ensemble of classifiers for each class. The result is a significant reduction in required expert involvement for uncer- tain image region classification. 1 Introduction Multimedia contents are rapidly becoming a major target for data mining research. This work is concerned with image mining, discovering patterns and knowledge from images for the purpose of classifying images. The specific problem we address is image region classification. Egeria Densa is an exotic submerged aquatic weed causing navigation and reservoir-pumping problems in the Sacramento-San Joaquin Delta of Northern California. As a part of a control program to manage Egeria, classification of regions in aerial images is required. This problem can be abstracted to one of clas- sifying massive data without class labels. Relying on hu- man experts for class labeling is not only time-consuming and costly, but also unreliable if the experts are overburdened with numerous minute and routine tasks. Massive manual classification becomes impractical when images are complex with many different objects (e.g., water, land, Egeria) under varying picture-taking conditions (e.g., deep water, sun glint). The main objective of the project is to relieve experts from go- ing through all the images and pointing out locations where Egeria exists in the image. We aim to automate the process via active learning to limit expert involvement to decisions about which the automatic classifier is uncertain. The contributions of this work are a novel concept of class- specific ensembles, and learning class-specific ensembles. We notice that different types of classifiers are better suited to de- tecting different objects such as Egeria, land, water. Since it is impractical to train one classifier for each object (as experts need to provide training instances for all objects), we propose a novel approach to class-specific ensembles. We also discuss how to combine individual classifiers to learn class-specific ensembles. We show that this approach significantly reduces the number of uncertain image regions and is better than a single ensemble for the task of object detection. 2 Class-Specific Ensembles We may tend to use as many classifiers in an ensemble as possible because (1) each classification algorithm may have a different view of the training image and capture varied as- pects of the image, as different classification algorithms have different biases and assumptions; and (2) no single classifier can cover all aspects. In other words, some algorithms may succeed in capturing some latent information about the do- main, while others may fail. However, problems can result from using too many classification algorithms: e.g., (1) us- ing more classification algorithms can result in longer overall training time, especially so if some of the algorithms are time- consuming to train; and (2) some algorithms may be prone to overfitting. If these algorithms are included in the ensemble, there may be a high risk of allowing the ensemble to over- fit some training image(s). The above analysis suggests the necessity of searching for a relevant set of classifiers to form ensembles. Exhaustive search for the best combination is im- practical because the search space is exponential in the total number of classification algorithms for consideration. Thus we need an efficient methodology to find the optimal combi- nation of classifiers for the class-specific ensembles. We adopt three performance criteria for this work: Preci- sion, Recall, and number of uncertain regions (UC). The first two standard measures are defined in terms of the instances that are relevant and the instances that are classified (or re- trieved). The third one is the number of instances (regions in an image) the class-specific ensembles cannot be certain about their class predictions. High precision or high recall alone is not a good performance measure as each describes only one aspect of classification. Together they provide a good measure: for example, the product of precision and recall (PR). If both values are 1, their product is 1, which means all and only positive instances are classified as posi- tive. Hence, PR is a measure for both generality and accuracy. F measure can be used for the same purpose. 3 Learning Class-Specific Ensembles In order to learn class-specific ensembles, we need a data set that links classifiers to the label of each image region. This