2014 11th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE)
978-1-4799-6230-3/14/$31.00 ©2014 IEEE

Automatic Segmentation of Mammograms Using a Scale-Invariant Feature Transform and K-Means Clustering Algorithm

Luis A. Salazar-Licea*, C. Mendoza, M.A. Aceves, J.C. Pedraza
Facultad de Informática, Universidad Autónoma de Querétaro, Queretaro, Mexico
*Corresponding author: l.antonyo.al@gmail.com

Alberto Pastrana-Palma
División de Estudios de Posgrado, Facultad de Contaduría y Administración, Universidad Autónoma de Querétaro, Queretaro, Mexico

Abstract: In this work, a Scale-Invariant Feature Transform (SIFT) method, together with K-means clustering, is used to find regions of interest (ROIs) in mammograms. This paper focuses on presenting a tool that can improve the search for suspicious areas that contain abnormalities, leaving the final decision to the radiologist. The methodology is divided into three steps. The first is a pre-processing step that consists of acquiring the image and reducing its size by erasing the background, leaving only the breast area, and eliminating noise. The second step improves image quality through image thresholding and contrast-limited adaptive histogram equalization (CLAHE). The last step is the location of regions of interest in the image, which uses SIFT as the main tool, complemented with Binary Robust Independent Elementary Features (BRIEF) to find descriptors and with K-means clustering as the classifier. Finally, the results present the locations of the ROIs and compare them with the positions of the abnormalities diagnosed by the Mammographic Image Analysis Society.

Keywords: mammogram; image processing; segmentation; SIFT.

I. INTRODUCTION

Breast cancer consists of a disordered and abnormal growth of breast cells. The World Health Organization estimates that about 84 million people will die of cancer between 2005 and 2015.
In Mexico, since 2006 breast cancer has been the second leading cause of death in the 30 to 54 age group, and it ranks as the first cause of mortality from malignant tumors in women [2]. A mammogram is a non-invasive radiographic test of the mammary gland that can detect cancer up to two years before it can be felt and can reduce mortality by up to 30% [3]. Among the disadvantages of this technique for making a diagnosis are: low differentiation in the appearance of cancerous tissue compared with normal parenchymal tissue; varied morphology of the findings; similarity between the morphologies of the findings; varied size of the findings; deficiencies in the skill with which the radiograph is taken; and visual fatigue or distraction of the radiologist [4].

The typical steps of computer-assisted diagnosis are (Fig. 1) [5]: pre-processing, whose aim is to increase the image quality and reduce noise; segmentation, whose objective is to find regions of interest (ROIs) suspected of containing anomalies; detection, which selects the best set of features in the region of interest; and finally, based on the detection, false-positive reduction and lesion classification.

Fig. 1. Typical steps of computer-assisted diagnosis.

II. METHODS AND MATERIALS

This section first describes the materials and tools used, and then presents the methods in three main subsections: image preprocessing, image enhancement, and location of regions of interest.

A. Materials

The mammographic images analyzed in this work belong to the mini-MIAS (Mammographic Image Analysis Society) database from the UK National Breast Screening Programme [6]. This database contains 322 images; each image has a 200 micron pixel edge and a size of 1024 × 1024 pixels. All algorithms were implemented entirely with open source tools: the Python programming language, the Eclipse IDE, and the OpenCV libraries.

B. Image Preprocessing

This stage consists of performing an image pre-segmentation that selects only the breast area and eliminates noise, in order to reduce processing time. The corner detector