Detecting Prostatic Adenocarcinoma From Digitized Histology Using a Multi-Scale Hierarchical Classification Approach

Scott Doyle, Carlos Rodriguez, Anant Madabhushi
Dept. of Biomedical Engineering, Rutgers University, Piscataway, NJ 08854, USA
anantm@rci.rutgers.edu

John Tomaszeweski, Michael Feldman
Dept. of Surgical Pathology, University of Pennsylvania, Philadelphia, PA 19104, USA
feldmanm@mail.med.upenn.edu

Abstract: In this paper we present a computer-aided diagnosis (CAD) system to automatically detect prostatic adenocarcinoma from high resolution digital histopathological slides. This is especially desirable considering the large number of tissue slides that are currently analyzed manually – a laborious and time-consuming task. Our methodology is novel in that texture-based classification is performed using a hierarchical classifier within a multi-scale framework. Pyramidal decomposition is used to reduce an image into its constituent scales. The cascaded image analysis across multiple scales is similar to the manner in which pathologists analyze histopathology. Nearly 600 different image texture features at multiple orientations are extracted at every pixel at each image scale. At each image scale the classifier only analyzes those image pixels that have been determined to be tumor at the preceding lower scale. Results of quantitative evaluation on 20 patient studies indicate (1) an overall accuracy of over 90% and (2) an approximate 8-fold savings in terms of computational time. Both the AdaBoost and Decision Tree classifiers were considered and in both cases tumor detection sensitivity was found to be relatively constant across different scales. Detection specificity was however found to increase at higher scales reflecting the availability of additional discriminatory information.

Index Terms: Hierarchical classifier, decision trees, AdaBoost, prostate cancer, digitized histology.

I.
INTRODUCTION

Prostate cancer is a major problem in the United States, with a predicted 234,000 cases and 27,000 deaths in 2006 according to the American Cancer Society. Patient prognosis is greatly improved if the condition is diagnosed early. The current gold standard for prostate cancer diagnosis is histological analysis of tissue samples obtained via transrectal ultrasound (TRUS) biopsy. Current TRUS protocols mandate between 12 and 20 biopsy samples per patient. The low accuracy of TRUS (20-25%) for elevated prostate specific antigen levels means that pathologists spend several man-hours sieving through mostly benign tissue.

The advent of digital high-resolution scanners has made available digitized histological tissue samples that are amenable to computer-aided diagnosis (CAD). CAD can relieve the pathologists’ burden by separating out obviously benign and obviously malignant tissue, reducing the amount of tissue area that a pathologist must analyze. While histology-based CAD is relatively recent compared to radiology-based CAD, some researchers have developed CAD methods to analyze prostate histology. Previous CAD work has mostly used color, texture, and wavelet features [1], texture-based second-order features [2], or morphological attributes [3] to distinguish manually defined regions of interest on the image. The choice of scale at which to do the image analysis, however, is typically arbitrary. This ad hoc scale selection runs contrary to the multi-scale approach adopted by pathologists, who usually identify suspicious regions at lower resolutions and only use the information at the higher scales (where the high level shape and architectural information is present) to confirm their suspicions (Figure 1).

Figure 1 shows an image of digitized prostate histopathology at multiple scales.
While low level attributes such as texture and intensity are available at the lower image scales (Figure 1 (a)-(c)) to distinguish benign from cancerous regions, higher level shape and architectural attributes of tissue become apparent only at the higher scales (Figure 1 (d), (e)).

In this paper we present a multi-scale approach to detecting prostate cancer from digitized histology. Nearly 600 texture and intensity features are extracted at every image pixel and at every image scale. A hierarchical classification scheme (a variant of the cascade classifier originally proposed by Viola and Jones [4]) analyzes, at each scale, only those regions that were determined to be suspicious at the scale immediately preceding it. Thus, the classifier’s detection specificity increases at higher scales without compromising the sensitivity of cancer detection. Our hierarchical CAD paradigm is not specific to any particular classifier, and similar results are obtained with the Decision Tree [7] and AdaBoost [6] algorithms.

The novel aspects of this work are the following:
1) Nearly 600 texture features at multiple orientations are extracted to build signature vectors to distinguish adenocarcinoma from benign stromal epithelium.
2) A multi-resolution approach is used wherein feature extraction and feature classification are performed at each image scale, which is similar to the manner in which a pathologist analyzes tissue slides.
3) A hierarchical classifier (with the AdaBoost [6] and Decision Tree [7] algorithms) analyzes, at each image scale, only the specific regions determined to be tumor at the immediately preceding scale, which significantly reduces execution time without compromising accuracy.

Proceedings of the 28th IEEE EMBS Annual International Conference, New York City, USA, Aug 30-Sept 3, 2006. SaBP1.6. 1-4244-0033-3/06/$20.00 ©2006 IEEE.
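The coarse-to-fine scheme described in the abstract and introduction can be illustrated with a minimal sketch. It is not the authors' implementation. The 2x2 block averaging used as the pyramid step, the function names `downsample`, `build_pyramid`, and `cascade`, and the intensity threshold standing in for a trained AdaBoost or Decision Tree classifier are all assumptions made for illustration. The sketch only shows how restricting each scale to pixels flagged at the preceding lower scale reduces the number of pixels that must be classified.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]
Mask = List[List[bool]]


def downsample(img: Image) -> Image:
    """Halve each dimension by averaging 2x2 blocks (assumed pyramid step)."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(img: Image, levels: int) -> List[Image]:
    """Return images ordered from the lowest (coarsest) to the highest scale."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid[::-1]


def cascade(
    pyramid: List[Image],
    classify: Callable[[Image, int, int], bool],
) -> Tuple[List[Mask], List[int]]:
    """Classify all pixels at the lowest scale; at each higher scale visit
    only pixels whose parent pixel was flagged at the preceding scale."""
    masks: List[Mask] = []
    visited: List[int] = []
    parent = None
    for img in pyramid:
        mask = [[False] * len(img[0]) for _ in img]
        count = 0
        for r in range(len(img)):
            for c in range(len(img[0])):
                if parent is not None:
                    pr, pc = r // 2, c // 2
                    in_bounds = pr < len(parent) and pc < len(parent[0])
                    if not in_bounds or not parent[pr][pc]:
                        continue  # parent was not suspicious: skip this pixel
                count += 1
                mask[r][c] = classify(img, r, c)
        masks.append(mask)
        visited.append(count)
        parent = mask
    return masks, visited


# Toy 8x8 image: a bright 4x4 "suspicious" block in the top-left corner.
image = [[1.0 if r < 4 and c < 4 else 0.0 for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image, levels=3)


def threshold(img: Image, r: int, c: int) -> bool:
    """Hypothetical stand-in for a trained per-pixel classifier."""
    return img[r][c] > 0.5


masks, visited = cascade(pyramid, threshold)
exhaustive = sum(len(img) * len(img[0]) for img in pyramid)
print(visited)                   # [4, 4, 16]
print(sum(visited), exhaustive)  # 24 84
```

In this toy case the cascade classifies 24 pixels, whereas classifying every pixel at every scale would take 84. The ratio depends entirely on how much of the image is flagged at the lower scales, so it says nothing about the 8-fold savings reported in the paper.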