Digital Mammograms Classification Using a Wavelet Based Feature Extraction Method
Ibrahima Faye 1, Brahim Belhaouari Samir 2, Mohamed M. M. Eltoukhy 3
1, 2 Fundamental and Applied Sciences Department, 3 Electrical and Electronic Engineering Department
Universiti Teknologi PETRONAS, Perak, Malaysia.
1 ibrahima_faye@petronas.com.my, 2 brahim_balhaouari@petronas.com.my, 3 tokhy2478@yahoo.com

Abstract: This paper introduces a new method of feature extraction from wavelet coefficients for the classification of digital mammograms. A matrix is constructed by placing the wavelet coefficients of each image of a building set in a row vector. The method then consists of selecting, by thresholding, the columns that maximize the Euclidean distances between the different class representatives. The selected columns are then used as features for classification. The method is tested on a set of images provided by the Mammographic Image Analysis Society (MIAS), first to classify between normal and abnormal tissues and then between benign and malignant tissues. For both classifications, a high accuracy rate (98%) is achieved.

Keywords: Breast cancer; Wavelet transform; Feature extraction; Digital mammogram.

I. INTRODUCTION

Cancer is a leading cause of death worldwide; it accounted for 7.4 million deaths (around 13% of all deaths) in 2004. More than 70% of all cancer deaths occurred in low and middle income countries. Deaths from cancer worldwide are projected to continue rising, with an estimated 12 million deaths in 2030 [1]. According to statistics published by the World Health Organization (WHO), there were 519,000 deaths from breast cancer in 2004 [1]. The cause of cancer is still unknown, and there is no known way to prevent it. Early detection and treatment are considered the most promising approaches to reducing breast cancer mortality [2]. Mammography is considered the most reliable, low-cost, and highly sensitive technique for detecting small lesions.
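
The column-selection step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how class representatives are computed or how the threshold is chosen, so the sketch assumes that each class is represented by the mean of its rows, that a column is scored by the sum of squared differences between class representatives, and that the threshold is supplied by hand. The function names and the toy matrix are hypothetical, and the rows stand in for wavelet coefficient vectors that would be computed beforehand.

```python
from itertools import combinations


def class_representatives(rows, labels):
    """Return one representative vector per class.

    Assumption: the representative is the column-wise mean of the class rows.
    """
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def column_scores(representatives):
    """Score each column by the summed squared difference between all pairs
    of class representatives (its contribution to their squared Euclidean
    distances)."""
    reps = list(representatives.values())
    n_cols = len(reps[0])
    scores = [0.0] * n_cols
    for a, b in combinations(reps, 2):
        for j in range(n_cols):
            scores[j] += (a[j] - b[j]) ** 2
    return scores


def select_columns(rows, labels, threshold):
    """Keep the indices of columns whose score exceeds the threshold."""
    scores = column_scores(class_representatives(rows, labels))
    return [j for j, score in enumerate(scores) if score > threshold]


# Toy building set: each row plays the role of one image's coefficient vector.
rows = [
    [1.0, 5.0, 0.2, 3.0],
    [1.1, 5.2, 0.1, 3.1],
    [0.9, 1.0, 0.3, 7.0],
    [1.0, 1.2, 0.2, 7.2],
]
labels = ["normal", "normal", "abnormal", "abnormal"]

selected = select_columns(rows, labels, threshold=1.0)
# Reduced feature vectors: only the selected columns are kept for each image.
features = [[row[j] for j in selected] for row in rows]
print(selected)
```

In this toy example only the second and fourth columns separate the two class means, so they are the ones retained as features; the remaining columns barely differ between classes and are discarded.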
Radiologists search for signs of abnormality, but the signs of early disease are often small or subtle. This is the main cause of many missed diagnoses, which can be attributed mainly to human factors such as subjective or varying decision criteria, distraction by other image features, or simple oversight [3]. Moreover, a false positive detection leads to an unnecessary biopsy. It has been estimated that only 20–30% of breast biopsy cases prove to be cancerous [4, 5]. On the other hand, in a false negative detection, an actual tumor remains undetected. Studies have shown that 10–30% of visible cancers are undetected [3]. Thus, there is a significant need to develop methods for the automatic classification of suspicious areas in mammograms, in order to help radiologists improve the efficacy of screening programs and avoid unnecessary biopsies.

Computer aided detection (CAD) systems, which use computer technologies to detect abnormalities in mammograms such as microcalcifications, masses, architectural distortion and asymmetry, can play a key role in the early detection of breast cancer and help to reduce the mortality rate among women with breast cancer [6]. Computer aided methods in the field of digital mammography are divided into two main categories: computer aided detection methods, which pinpoint suspicious regions in mammograms for further analysis by an expert radiologist, and computer aided diagnosis methods, which decide whether the examined suspicious regions consist of abnormal or healthy tissue and distinguish between malignant and benign tissue. CAD systems for detecting masses or microcalcifications in mammograms have already been used and have proven to be a potentially powerful tool [7], so radiologists are attracted by the effectiveness of CAD systems in clinical applications [6].
One of the main points that should be taken into serious consideration when implementing a robust classifier for recognizing breast tissue is the selection of appropriate features that adequately describe and highlight the differences between abnormal and normal tissue. Feature extraction is an important factor that directly affects the result of mammogram classification. Most systems extract features to detect and classify an abnormality as benign or malignant from textures, statistical properties, the spatial domain, the fractal domain and wavelet bases [8]. Distinguishing malignant from benign abnormalities remains a challenging problem, and researchers spend a lot of time attempting to find a group of features that will help them improve the discrimination of malignant from benign cases. There are various feature transforms that serve to condense the input data and to reduce redundancies by highlighting important characteristics of the image [9].

Texture is a commonly used feature in the analysis and interpretation of images. Oliver [10] distinguishes the textures employed in mammography according to three main extraction methods:
(1) Statistical methods: The extracted features of this class include those obtained from co-occurrence matrices and from surface variation measurements (smoothness, coarseness and regularity) [10].
(2) Model-based methods: The analysis of texture features in this class is based on prior models such as Markov random fields, auto-regressive models and fractals [11].
(3) Signal processing methods: In this class, texture features are obtained according to either pixel characteristics or