Biomedical Image Classification with Random Subwindows and Decision Trees

Raphaël Marée, Pierre Geurts, Justus Piater, and Louis Wehenkel
GIGA/CBIG, Montefiore Institute, University of Liège, Belgium

Abstract
In this paper, we address a biomedical image classification problem: the automatic classification of x-ray images into 57 predefined classes with large intra-class variability. To this end, we apply and slightly adapt a recent generic method for image classification based on ensembles of decision trees and random subwindows [MGPW05]. We obtain classification results close to the state of the art on a publicly available database of 10000 x-ray images. We also provide some clues for interpreting the classification of each image in terms of subwindow relevance.

Image Classification
⊲ Goal: given a set of training images labelled with a finite number of classes, the goal of an automatic image classification method is to build a model that accurately predicts the class of new, unseen images.
⊲ Biomedical applications: organize large-scale image databases into categories without limitation to a specific diagnostic study, set up clinical diagnosis tools, provide high-throughput cell phenotype screening, ...
⊲ Some existing solutions:
• A pre-processing "feature extraction" step, specific to the particular problem and application domain
• Extracted features used as new input variables for traditional learning algorithms (e.g., nearest-neighbor or neural-network classifiers)

Random Subwindows and Decision Trees
⊲ Concepts
• Extraction of a large number of possibly overlapping square subwindows, of random sizes and at random positions
• Pixel-based description with scale normalization
• Tree-based ensemble machine learning methods
• Successfully applied to household objects, buildings, landscape themes, handwritten digits, faces, ...
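The subwindow extraction and scale normalization above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name and the toy gradient image are invented for the example, nearest-neighbor interpolation stands in for the (unspecified) resizing scheme, and images are plain 2-D lists of grayscale values.

```python
import random

def extract_subwindows(image, n_subwindows, out_size=16, rng=None):
    """Extract square subwindows of random size at random positions from a
    2-D grayscale image (list of lists), each rescaled to out_size x out_size
    by nearest-neighbor interpolation (a stand-in for the scale
    normalization described above)."""
    rng = rng or random.Random(0)
    h, w = len(image), len(image[0])
    subwindows = []
    for _ in range(n_subwindows):
        # Random side length, at a random position fully inside the image.
        side = rng.randint(1, min(h, w))
        y = rng.randint(0, h - side)
        x = rng.randint(0, w - side)
        # Nearest-neighbor resize to out_size x out_size.
        resized = [
            [image[y + (i * side) // out_size][x + (j * side) // out_size]
             for j in range(out_size)]
            for i in range(out_size)
        ]
        subwindows.append(resized)
    return subwindows

# Toy 32 x 32 gradient "image", just to exercise the function.
img = [[r + c for c in range(32)] for r in range(32)]
wins = extract_subwindows(img, n_subwindows=5)
print(len(wins), len(wins[0]), len(wins[0][0]))  # 5 16 16
```

Each normalized 16 × 16 subwindow then yields 256 pixel values that serve directly as input attributes for the tree-based learners.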
⊲ Learning stage
[Figure: training images of classes C1, C2, C3 and the subwindows extracted from them]
• Class-balanced extraction of N_w subwindows: from each training image of class c, we extract N_w / (m · nb_c) subwindows, where m is the number of classes and nb_c the number of training images of class c
• Subwindows resized to a fixed size (16 × 16 pixels)
• Each subwindow labelled with the class of its parent image
[Figure: ensemble of T decision trees T1–T5]
• A subwindow classification model built using supervised methods
• Ensembles of T decision trees: Tree Boosting, Extra-Trees [GEW06]

Random Subwindows and Decision Trees (continued)
⊲ Prediction stage
[Figure: a test subwindow propagated through trees T1–T5; per-class vote counts are summed over subwindows, yielding predicted class C2]
• Extraction of N_w,test subwindows from the test image
• Propagation of each subwindow through each tree
• Aggregation of tree votes: the image is assigned the majority class among the classes assigned to its subwindows

Dataset: IRMA Challenge
⊲ Description
• 10000 x-ray images (courtesy of TM Lehmann, RWTH Aachen, Germany, http://www.irma-project.org)
• 57 classes according to the IRMA code: different modalities, orientations, body parts, and biological systems
⊲ Examples
[Figure: images from the "coronal, pelvis, musculosceletal" class; images from 7 cranium/cervical spine classes]

Protocol and Results
⊲ Protocol and parameters
• Training set: 9000 images; test set: 1000 images ([iCS05])
• N_w = 800000, T = 25, N_w,test = 500
⊲ Misclassification error rate

  Method                                   Error rate
  1-NN + IDM [KGN04]                       12.6%
  1-NN + CCF + IDM + Tamura                13.3%
  Discriminative patches [DKN05]           13.9%
  Random Subwindows + Tree Boosting        14.0%
  MI1 Confidence                           14.6%
  Random Subwindows + Extra-Trees          14.7%
  Gift 5NN8g                               20.6%
  ...
  Nearest Neighbor, 32 × 32, Euclidean     36.8%
  ...
  Texture directionality                   73.3%

⊲ Computational efficiency
• Training time is on the order of T · N_w · log N_w
• Prediction time is essentially proportional to T · N_w,test · log N_w

Interpretability
• Well-classified subwindows could bring potentially useful information about their class

Conclusion
• We applied our generic method [MGPW05] to a specific biomedical task
• We obtained results competitive with state-of-the-art algorithms without tedious adaptation, which confirms the potential of the approach for a wide range of applications
• The possibility of extracting interpretable information from images has been highlighted

References
[DKN05] T. Deselaers, D. Keysers, and H. Ney. Discriminative training for object recognition using image patches. In Proc. International Conference on Computer Vision and Pattern Recognition (CVPR), volume 2, pages 157–162, June 2005.
[GEW06] P. Geurts, D. Ernst, and L. Wehenkel. Extremely randomized trees. Accepted for publication in Machine Learning Journal, 2006.
[iCS05] Proc. of the Cross Language Evaluation Forum (CLEF), Springer Lecture Notes in Computer Science, to appear, 2005.
[KGN04] D. Keysers, C. Gollan, and H. Ney. Classification of medical images using non-linear distortion models. In Bildverarbeitung für die Medizin (BVM), pages 366–370, March 2004.
[MGPW05] R. Marée, P. Geurts, J. Piater, and L. Wehenkel. Random subwindows for robust image classification. In C. Schmid, S. Soatto, and C. Tomasi, editors, Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2005), volume 1, pages 34–40. IEEE, June 2005.
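The vote aggregation of the prediction stage can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is invented, and the hard-coded vote lists are hypothetical stand-ins for the class votes that the T trained trees would actually return for each subwindow.

```python
from collections import Counter

def aggregate_votes(per_subwindow_votes):
    """Prediction-stage aggregation: each subwindow of the test image
    receives one class vote from each of the T trees; votes are summed
    over all subwindows and the majority class is returned."""
    totals = Counter()
    for votes in per_subwindow_votes:  # one list of T tree votes per subwindow
        totals.update(votes)
    return totals.most_common(1)[0][0]

# Hypothetical votes from T = 5 trees on 3 subwindows of one test image.
votes = [
    ["C2", "C2", "C1", "C2", "C3"],
    ["C2", "C1", "C2", "C2", "C2"],
    ["C3", "C2", "C2", "C1", "C2"],
]
print(aggregate_votes(votes))  # C2
```

Note that ties and vote weighting are ignored here; the poster only specifies that the majority class over all subwindow votes is assigned to the image.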