A Visual Latent Semantic Approach for Automatic Analysis and Interpretation of Anaplastic Medulloblastoma Virtual Slides

Angel Cruz-Roa¹, Fabio González¹, Joseph Galaro², Alexander R. Judkins³, David Ellison⁴, Jennifer Baccon⁵, Anant Madabhushi², and Eduardo Romero¹

¹ BioIngenium Research Group, Universidad Nacional de Colombia, Bogotá, Colombia
² Rutgers, Department of Biomedical Engineering, Piscataway, NJ, USA
³ Children Hospital of L.A., Department of Pathology Lab Medicine, Los Angeles, CA, USA
⁴ St. Jude Children's Research Hospital from Memphis, TN, USA
⁵ Penn State College of Medicine, Department of Pathology, Hershey, PA, USA

Abstract. A method for automatic analysis and interpretation of histopathology images is presented. The method represents the image data set with bag-of-features histograms built from a visual dictionary of Haar-based patches, and uses a novel visual latent semantic strategy to characterize the visual content of a set of images. One important contribution of the method is an interpretability layer, which explains a particular classification by visually mapping the most important visual patterns associated with that classification. The method was evaluated on a challenging problem: automated discrimination of medulloblastoma tumors as anaplastic or non-anaplastic, based on image-derived attributes from whole-slide images. The data set comprised 10 labeled histopathological patient studies, 5 anaplastic and 5 non-anaplastic, with 750 square images per study cropped at random from the cancerous regions of the whole slide. The experimental results show that the new method is competitive in terms of classification accuracy, achieving 0.87 on average.

1 Introduction

This paper presents a new method, ViSAI, for automatic analysis and interpretation of histopathological images.
The method comprises three main stages: learning an image representation based on bag of features (BOF), characterizing the rich visual variety of a histopathological image collection using visual latent topic analysis, and connecting visual patterns with the semantics of the problem using a probabilistic classification model. The learnt probabilistic model is applied to new images, and the class posterior probability is used to determine the corresponding class.

The method is applied to the classification of a type of brain cancer called medulloblastoma, which is one of the most common types of malignant brain tumors [10]. In adults the disease is rare, whereas in children it accounts for 25% of all pediatric brain tumors. Tumor classification of medulloblastoma is currently performed by microscopic examination, and no quantitative image analysis and classification tools are available so far for this task. Different histologic types of medulloblastoma have different prognoses.

N. Ayache et al. (Eds.): MICCAI 2012, Part I, LNCS 7510, pp. 157–164, 2012. © Springer-Verlag Berlin Heidelberg 2012
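As a rough illustration of the bag-of-features representation and the posterior-based decision described in the introduction, the following Python sketch builds a normalized histogram of visual words for one image and picks the class with the highest posterior. It is not the authors' implementation. The hard nearest-word assignment with Euclidean distance, the function names `bof_histogram` and `predict_class`, and the label strings are assumptions made here for illustration. The Haar-based patch descriptors, the latent topic analysis, and the probabilistic model that would produce the posteriors are left out.

```python
import numpy as np


def bof_histogram(patch_features, dictionary):
    """Return a normalized bag-of-features histogram for one image.

    patch_features: (n_patches, d) array, one feature vector per patch.
    dictionary:     (k, d) array of visual words (the visual dictionary).
    Assumption: each patch is hard-assigned to its nearest visual word
    under Euclidean distance; the paper may use a different assignment.
    """
    patch_features = np.asarray(patch_features, dtype=float)
    dictionary = np.asarray(dictionary, dtype=float)
    # Squared Euclidean distance from every patch to every visual word.
    diffs = patch_features[:, None, :] - dictionary[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # Each patch votes for the index of its closest visual word.
    words = np.argmin(dists, axis=1)
    counts = np.bincount(words, minlength=dictionary.shape[0]).astype(float)
    # Normalize so that the histogram sums to one.
    return counts / counts.sum()


def predict_class(posteriors, labels=("anaplastic", "non-anaplastic")):
    """Return the label whose class posterior probability is largest.

    The posteriors are assumed to come from some already trained
    probabilistic model; that model is not part of this sketch.
    """
    return labels[int(np.argmax(posteriors))]
```

In a full pipeline, a histogram like this would be computed for every image and passed to the latent topic analysis stage, whose output feeds the probabilistic classifier.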