Classifying Digital Mammogram Masses using Univariate ANOVA Discriminant Analysis
B. Surendiran 1, Y. Sundaraiah 2, A. Vadivel 3
1 surendiran@gmail.com, 2 sundarnitt@yahoo.com, 3 vadi@nitt.edu

Abstract—A univariate Analysis Of Variance (ANOVA) Discriminant Analysis (DA) classifier is proposed for classifying the masses present in mammograms. The approach combines 19 shape properties of the mass regions and classifies each mass as benign or malignant using univariate ANOVA. The experiments are performed on images from the DDSM database. Experimental results show that the proposed method reaches high classification accuracy compared to existing algorithms.

Keywords—Discriminant analysis, Digital Mammogram, Shape properties, Classifying as Benign or Malignant, Univariate ANOVA.

I. INTRODUCTION
Breast cancer is a leading cause of death among women. Every 3 minutes a woman is diagnosed with breast cancer, and every 13 minutes a woman dies from it [1]. The exact cause of breast cancer is unknown, and the best-known preventive measure is early diagnosis. Mammography is the best-known technique for early breast cancer detection. Breast cancer death rates have been dropping steadily since 1995, due to earlier detection and increased use of mammography [1]. Computer Aided Detection (CAD) systems have been developed to aid radiologists in diagnosing cancer from digital mammograms. CAD has been reported to improve breast cancer diagnostic accuracy by 14.2% [2].

Masses are abnormal growths of tumor cells in the breast and may be malignant or benign: malignant masses are cancerous, whereas benign masses are non-cancerous. Various features such as shape, size, and margin (border) have been used to characterize the abnormalities or masses present in a mammogram. These features conform to the standard specified by the Breast Imaging Reporting and Data System (BI-RADS) [3].
Benign masses are round or oval in shape and have smooth, circumscribed margins, whereas malignant masses are irregular in shape and have ill-defined, microlobulated, or spiculated margins. Shape and margin characteristics therefore play an important role in classifying a mass as benign or malignant. Previous approaches that classify abnormalities based on the BI-RADS system have given accurate results [4, 5], so we concentrate on the shape and margin properties of the masses.

Previous approaches use only a few statistical or shape-based features. Methods based on statistical features use the gray values or the histogram of the mammogram to classify masses [6]. However, the gray values of a mammogram change when it is over-enhanced or noisy. The classification rate obtained by statistical classifiers that use histogram or gray values is 70% [7]. Most existing work concentrates on classifying a region as normal or abnormal; such methods, which use shape features with neural network classifiers, have obtained good accuracy [8, 9]. However, most previous approaches that classify a mass as benign or malignant do not achieve a high classification rate. In [10], a complex Bayesian neural network classifier was used with five statistical measures derived from the co-occurrence matrix to classify a region as benign or malignant. It was tested on a small dataset of only 17 sample mammograms and achieved a maximum accuracy of 81%.

Because shape-based classifiers give better results, we use 19 shape properties to classify a mass as benign or malignant, and we obtain a high accuracy rate with a univariate ANOVA (ANalysis Of VAriance) discriminant analysis classifier. This paper is organized as follows. In Section 2, we present techniques for feature extraction using shape properties.
In Section 3, we discuss the univariate ANOVA discriminant analysis classification method. In Section 4, we present the results of our experiments. In Section 5, we conclude the paper.

II. MASS SHAPE FEATURE EXTRACTION
A. Mass Shape Characteristics
Figure 1. Shape Characteristics of Masses
Fig. 1 shows the mass shapes in mammograms as specified by the BI-RADS system. Benign masses have round or oval shapes with circumscribed margins. Malignant masses have irregular shapes with ill-defined, microlobulated, or spiculated margins.

B. Shape Properties
For the experiments, we used mammograms from the DDSM database [11]. The ground truth available with each mammogram is used to measure the classification rate.

2009 International Conference on Advances in Recent Technologies in Communication and Computing 978-0-7695-3845-7/09 $25.00 © 2009 IEEE DOI 10.1109/ARTCom.2009.33 175

Multimedia Information Retrieval Group, National Institute of Technology, Tiruchirappalli, India
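
The contrast drawn in Section II-A between round, circumscribed benign masses and irregular, spiculated malignant masses can be illustrated with a single shape descriptor. The sketch below is only an illustration under stated assumptions: this section does not list the 19 shape properties, so circularity (4πA/P², where A is area and P is perimeter) is assumed here as one plausible example and may not be among the features actually used. The contours are synthetic polygons, not DDSM data, and the function names are hypothetical.

```python
import math


def polygon_area_perimeter(points):
    """Return (area, perimeter) of a closed polygon given as (x, y) vertices."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the contour
        twice_area += x0 * y1 - x1 * y0  # shoelace formula term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2.0, perimeter


def circularity(points):
    """Circularity 4*pi*A / P^2: 1 for a perfect circle, lower for jagged shapes."""
    area, perimeter = polygon_area_perimeter(points)
    return 4.0 * math.pi * area / perimeter ** 2


def round_contour(radius=10.0, n=360):
    """Synthetic stand-in for a round, circumscribed (benign-like) outline."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]


def spiculated_contour(inner=4.0, outer=10.0, spikes=12):
    """Synthetic stand-in for a spiculated (malignant-like) outline."""
    points = []
    for k in range(2 * spikes):
        r = outer if k % 2 == 0 else inner  # alternate spike tips and valleys
        angle = math.pi * k / spikes
        points.append((r * math.cos(angle), r * math.sin(angle)))
    return points


if __name__ == "__main__":
    print("round contour circularity:     ", circularity(round_contour()))
    print("spiculated contour circularity:", circularity(spiculated_contour()))
```

On these synthetic outlines the round contour scores close to 1 and the spiculated contour scores much lower, which is the kind of separation a shape-based classifier relies on. A real pipeline would compute such descriptors from the segmented mass region of each mammogram rather than from idealized polygons.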