2014 11th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE)
978-1-4799-6230-3/14/$31.00 ©2014 IEEE
Automatic Segmentation of Mammograms Using a
Scale-Invariant Feature Transform and K-Means
Clustering Algorithm
Luis. A. Salazar-Licea*, C. Mendoza, M.A. Aceves,
J.C. Pedraza
Facultad de Informática,
Universidad Autónoma de Querétaro
Queretaro, Mexico
*Corresponding author: l.antonyo.al@gmail.com
Alberto Pastrana-Palma
División de Estudios de Posgrado
Facultad de Contaduría y Administración,
Universidad Autónoma de Querétaro
Queretaro, Mexico
Abstract— In this work, a Scale-Invariant Feature
Transform (SIFT) method, together with K-means clustering, is used
to find regions of interest (ROIs) in mammograms.
This paper presents a tool that can improve the
search for suspicious areas that contain abnormalities, leaving
the final decision to the radiologist. The methodology is divided
into three steps. The first is a pre-processing step that consists of
acquiring the image and reducing its size by erasing the background,
leaving only the breast area, and eliminating noise. The second
step improves the image quality through image
thresholding and contrast-limited adaptive histogram equalization
(CLAHE). The last step locates the
regions of interest in the image using SIFT as the main tool,
complemented with Binary Robust Independent Elementary
Features (BRIEF) to compute descriptors and with K-means
clustering as the classifier. Finally, the locations of the
ROIs are presented and compared with the positions of the
abnormalities diagnosed by the Mammographic Image
Analysis Society.
Keywords—mammogram; image processing; segmentation;
SIFT.
I. INTRODUCTION
Breast cancer consists of a disordered and abnormal
growth of breast cells. The World Health Organization
estimates that about 84 million people will die because of this
disease between 2005 and 2015. In Mexico, since 2006
breast cancer has been the second highest cause of death in the age
group of 30 to 54 years, and it ranks as the first cause of mortality
from malignant tumors in women [2]. A mammogram is a
non-invasive radiographic test of the mammary gland that
can detect cancer up to two years before it can be felt and can
reduce mortality by up to 30% [3].
The disadvantages of this technique for making a
diagnosis include: low differentiation in appearance between
cancerous tissue and normal parenchymal tissue;
varied morphology of the findings; similarity between the
morphologies of the findings; varied size of the findings;
deficiencies in the skill with which the radiograph is taken; and visual
fatigue or distraction of the radiologist [4].
The typical steps of computer-assisted diagnosis are (Fig. 1)
[5]: pre-processing, whose aim is to increase the image
quality and reduce noise; segmentation, whose objective is to
find regions of interest (ROIs) suspected of containing
anomalies; detection, which selects the best set of features in
the region of interest; and finally, based on the detection,
false-positive reduction and lesion
classification.
Fig. 1. Typical steps of computer-assisted diagnosis.
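To make the segmentation stage more concrete, the following minimal sketch groups 2-D keypoint locations into candidate ROIs with a plain K-means (Lloyd's algorithm) and draws a bounding box around each cluster. It is an illustration under stated assumptions, not the implementation used in this work: the keypoint coordinates are made up, the function names `kmeans` and `roi_boxes` and the `margin` parameter are hypothetical, and the centroids are initialised with the first k points only to keep the example deterministic.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, k, iterations=20):
    """Plain Lloyd's K-means on 2-D points; returns (centroids, labels)."""
    # Assumption: deterministic initialisation with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean_x = sum(x for x, _ in members) / len(members)
                mean_y = sum(y for _, y in members) / len(members)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, labels


def roi_boxes(points, labels, k, margin=10):
    """Bounding box (x_min, y_min, x_max, y_max) per non-empty cluster."""
    boxes = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue
        xs = [x for x, _ in members]
        ys = [y for _, y in members]
        boxes.append((min(xs) - margin, min(ys) - margin,
                      max(xs) + margin, max(ys) + margin))
    return boxes


# Hypothetical keypoint locations (pixel coordinates), e.g. from a detector.
keypoints = [(100, 120), (520, 610), (104, 118),
             (515, 600), (98, 125), (525, 605)]
centroids, labels = kmeans(keypoints, k=2)
boxes = roi_boxes(keypoints, labels, k=2)
print(labels)
print(boxes)
```

In practice the number of clusters k and the box margin would have to be chosen for the data at hand; here they are arbitrary illustrative values.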
II. METHODS AND MATERIALS
This section first describes the materials and tools used. It is
then divided into three main subsections, image preprocessing,
image enhancement and location of regions of interest, which
describe all the methods used in this work.
A. Materials
The mammographic images analyzed in this work belong
to the mini-MIAS (Mammographic Image Analysis Society)
database from the UK National Breast Screening Programme
[6]. This database contains 322 images; each image has a 200
micron pixel edge and a size of 1024x1024 pixels. All
developed algorithms were implemented entirely using open
source tools: the Python programming language, the Eclipse
IDE and the OpenCV libraries.
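The stated image geometry fixes the physical scale of each mammogram, which is useful when ROI sizes in pixels need to be related to lesion sizes in millimetres. The short sketch below works through that arithmetic from the two figures given above (200 micron pixel edge, 1024 pixels per side). The constant and function names are illustrative choices and are not part of the mini-MIAS database or of OpenCV.

```python
PIXEL_EDGE_UM = 200   # pixel edge length stated for mini-MIAS, in micrometres
IMAGE_SIDE_PX = 1024  # image side length stated for mini-MIAS, in pixels


def pixels_to_mm(pixels, pixel_edge_um=PIXEL_EDGE_UM):
    """Convert a length in pixels to millimetres."""
    return pixels * pixel_edge_um / 1000.0


def mm_to_pixels(mm, pixel_edge_um=PIXEL_EDGE_UM):
    """Convert a length in millimetres to pixels."""
    return mm * 1000.0 / pixel_edge_um


side_mm = pixels_to_mm(IMAGE_SIDE_PX)  # physical side length of one image
ten_mm_px = mm_to_pixels(10)           # a 10 mm structure, in pixels
print(side_mm, ten_mm_px)
```

Each image therefore covers a square of roughly 20.5 cm per side, and a 10 mm structure spans 50 pixels.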
B. Image Preprocessing
This stage consists of performing an image pre-segmentation
to acquire and select only the breast area and to eliminate
noise in order to reduce processing time. The corner detector