Identification of disease in CT of the lung using texture-based image analysis

John Malone, Jonathan M. Rossiter
Department of Engineering Maths
University of Bristol
Bristol, BS8 1TR, UK
Email: J.P.Malone@bris.ac.uk

Sanjay Prabhu, Paul Goddard
Department of Clinical Radiology
Bristol Royal Infirmary
Bristol, BS2 8HW, UK

Abstract: Here we aim to evaluate the pulmonary parenchyma from CT scans of the thorax using textural analysis. For each of 34 patients, 3 axial slices were chosen. We split each of the 102 images into grids with block sizes of 4, 8 and 16 pixels and calculated 18 textural features for each block. Using these features and a training set assembled by a radiologist, we train a Support Vector Machine (SVM) to recognise some typical patterns found on the scans and test the accuracy on the training set using cross-validation. Then, larger areas deemed broadly representative of each of the patterns under consideration were labelled on the 102 images and the classification accuracy for each pattern and each block size is presented. Using the classified images, we segment the lung regions using a variation of the normal method. Finally, we fuse the results from the 3 block sizes to form a single image using Naive Bayes and show this matches or improves on the accuracy using each of the individual block sizes alone.

I. INTRODUCTION

The development of a computer aided diagnosis (CAD) system for detecting disease from CT scans of the lung has received increasing attention in the last few years, in part no doubt due to the advances made in the scanning machines which enable more and increasingly accurate information to be extracted during a single breath of the patient. High-resolution computed tomography (HRCT) can produce two to three hundred scans and some time is required for two radiologists to examine the scans. In common with Uchiyama et al [4], we suggest the final aim is a system to provide an automated, or interactive, second opinion to the radiologist.
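The block decomposition and the fusion step outlined in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the cropping of partial edge blocks, the small epsilon, the uniform prior in the usage lines, and the particular Naive Bayes formulation (treating the per-block-size class probabilities as conditionally independent given the class) are all hypothetical choices. The feature extraction and the SVM are omitted.

```python
import numpy as np

BLOCK_SIZES = (4, 8, 16)  # block edge lengths in pixels, as stated in the abstract


def split_into_blocks(image, block_size):
    """Tile a 2-D image into non-overlapping square blocks.

    Returns an array of shape (rows, cols, block_size, block_size).
    Pixels that do not fill a whole block are cropped (an assumption).
    """
    height, width = image.shape
    rows, cols = height // block_size, width // block_size
    cropped = image[:rows * block_size, :cols * block_size]
    blocks = cropped.reshape(rows, block_size, cols, block_size)
    return blocks.swapaxes(1, 2)


def upsample_to_pixels(block_probs, block_size):
    """Repeat per-block class probabilities so that every pixel has a value.

    block_probs has shape (rows, cols, n_classes).
    """
    per_row = np.repeat(block_probs, block_size, axis=0)
    return np.repeat(per_row, block_size, axis=1)


def fuse_naive_bayes(prob_maps, prior):
    """Fuse per-pixel class probabilities obtained at several block sizes.

    Assumes conditional independence given the class, so that
    P(c | all sizes) is proportional to prior(c) * prod_s P(c | s) / prior(c).
    Each map has shape (height, width, n_classes); prior has shape (n_classes,).
    """
    log_prior = np.log(prior)
    log_post = log_prior
    for probs in prob_maps:
        # Epsilon (an assumption) guards against log(0).
        log_post = log_post + np.log(probs + 1e-12) - log_prior
    # Subtract the per-pixel maximum for numerical stability, then normalise.
    log_post = log_post - log_post.max(axis=-1, keepdims=True)
    post = np.exp(log_post)
    return post / post.sum(axis=-1, keepdims=True)


# Usage on a synthetic 64x64 image with 2 hypothetical classes.
image = np.arange(64 * 64, dtype=float).reshape(64, 64)
grids = {size: split_into_blocks(image, size) for size in BLOCK_SIZES}

# Stand-in classifier output: every block slightly favours class 0.
maps = []
for size in BLOCK_SIZES:
    rows, cols = grids[size].shape[:2]
    block_probs = np.tile(np.array([0.6, 0.4]), (rows, cols, 1))
    maps.append(upsample_to_pixels(block_probs, size))

fused = fuse_naive_bayes(maps, prior=np.array([0.5, 0.5]))
labels = fused.argmax(axis=-1)  # one fused class label per pixel
```

Working in log space keeps the product of several small probabilities from underflowing; with agreeing evidence from all three block sizes the fused posterior is sharper than any single input, which is the behaviour one would expect from this kind of fusion.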
In previous similar work, Uppaluri et al [1] developed a system that recognises honeycombing, ground-glass, bronchovascular, nodular, emphysematous and normal tissue in 72 subjects: 20 normal, 13 with emphysema, 19 with IPF and 20 with sarcoidosis. The data was split in half to obtain a training and a test set, and an overall accuracy of 93.5% was obtained on the test set. Delorme et al [2] classified normal, emphysema, ground-glass, intralobular fibrosis and vessels using 5x5 pixel blocks from 5 patients, and 70.7% were classified correctly. Heitmann et al [3] used 120 scans from 20 patients and a hybrid of self-organising neural networks and simple expert rules to classify ground-glass opacities on HRCT. More detailed results are outlined in that work, but the method was broadly successful on 91 of the 120 scans. More recently, in Uchiyama et al [4], regions on 315 HRCT images from 105 patients were labelled by 3 radiologists as ground-glass, reticular and linear opacities, nodular, honeycombing, emphysema and consolidation. Where there was unanimity, the regions were used as "gold standard" data and divided into contiguous 32x32 pixel blocks, although 96x96 blocks were also used to classify the 32x32 block at its centre. The lungs were segmented where possible using the standard technique of a morphological filter and thresholding (although in cases of severe consolidation, a manual method was used), then divided into 32x32 regions of interest and classified. The accuracy of detecting each of the abnormal patterns was between 88 and 100%, with a specificity of 88.1% in detecting normal tissue. Finally, Sluimer et al [5] aimed to distinguish between normal and abnormal tissue and used 657 regions of interest (ROIs) from 116 patients, each labelled as containing normal or abnormal tissue. The ROIs were circular with a radius of 80 pixels and all from the same height in the lung (the aortic arch). Each ROI was required to contain at least 75% abnormal tissue.
All experiments were performed as cross-validations, dividing the data set into 4 and obtaining 4 sets of results. The results are comparable to those of a radiologist, both when evaluating only the ROIs, i.e. without seeing the whole scan, and when seeing the whole scan as well.

To add to this research, we:
1) use 3 different block sizes simultaneously and fuse the results to maximise classification accuracy;
2) classify the areas outside the lungs by training also on tissue, fat and bone. This enables us to segment using a slightly different method than normal.

II. THE DATA

The clinical cases under evaluation were selected from daily practice in the Department of Clinical Radiology at Bristol Royal Infirmary. A total of 102 images are included in this test: 3 slices from each of the 34 patients, 1 each from the apex, base and level of the main bronchi. Of these 34 patients, 11 had normal lungs, 13 had fibrosis and 13 had emphysema/bullae (3 had both diseases). The scan width of 28 of the patients was either 6mm or 8mm, and for the remaining 6, two from each

1620 0-7803-8622-1/04/$20.00 ©2004 IEEE