Improving FCM and T2FCM Algorithms
Performance using GPUs for Medical Images
Segmentation
Mohammed A. Shehab, Mahmoud Al-Ayyoub and Yaser Jararweh
Jordan University of Science and Technology
Irbid, Jordan
Emails: mashehab12@cit.just.edu.jo, {maalshbool, yijararweh}@just.edu.jo
Abstract—Image segmentation has gained popularity recently due
to its numerous applications in fields such as computer vision
and medical imaging. As its name suggests, segmentation is concerned
with partitioning the image into separate regions, one of
which is of special interest. Such a region is called the Region of
Interest (RoI) and it is very important for many medical imaging
problems. Clustering is one of the segmentation approaches
typically used on medical images despite its long running time.
In this work, we propose to leverage the power of the Graphics
Processing Unit (GPU) to improve the performance of such
approaches. Specifically, we focus on the Fuzzy C-Means (FCM)
algorithm and its more recent variation, the Type-2 Fuzzy C-
Means (T2FCM) algorithm. We propose a hybrid CPU-GPU
implementation to speed up the execution time without affecting
the algorithm’s accuracy. The experiments show that such an
approach reduces the execution time by up to 80% for FCM
and 74% for T2FCM.
Index Terms—Medical Image Segmentation; Fuzzy C-Means;
Type-2 Fuzzy C-Means; GPU; CUDA
I. INTRODUCTION
Recently, medical image processing (for the different ex-
isting modalities such as magnetic resonance imaging (MRI),
computed tomography (CT), digital mammography, etc.) has
become more popular due to its obvious benefits in the
diagnosis of many diseases. Researchers are continuously
trying to come up with more accurate and efficient techniques
[1]. However, due to the recent advances in medical image
modalities and the increased size and resolution of medical
images, the processing capabilities of typical CPUs are no
longer sufficient. A recent trend is to exploit the capabilities
of the Graphics Processing Unit (GPU) in order to improve the
performance of medical image processing tasks [2], [3], [4].
Image segmentation is one of the fundamental tasks in
image processing. It focuses on how to extract objects from
images. It separates the image into different regions, one
of which is of special interest. Such a region is called the
Region of Interest (RoI) and it is very important for many
medical imaging problems [5], [6]. For example, segmentation
is an integral step in many Computer-Aided Diagnosis (CAD)
systems [7], [8], [9], [10]. Many approaches were proposed for
this task such as threshold-based methods, clustering methods,
compression-based methods, histogram-based methods and
region-growing methods [11], [12], [13], [14]. We focus here
on the clustering techniques for segmentation. Specifically,
we are concerned with the celebrated Fuzzy C-Means (FCM)
technique [15].
Due to its importance, several enhancements of FCM ap-
peared over the past three decades trying to improve the accu-
racy and performance of FCM. For the latter objective, [16],
[1], [17], [18] proposed to use GPU capabilities. GPUs follow
the single instruction, multiple data (SIMD) parallel programming
model. While both CPUs and GPUs can manage thousands
of threads via time-slicing, modern CPUs can
run only 4-12 threads truly in parallel, whereas GPUs can run
on the order of a thousand threads at a time [1], [19].
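To illustrate why FCM maps well onto this kind of data parallelism, the following is a minimal sketch in plain Python. It assumes grayscale intensities as a one-dimensional feature, the standard FCM update rules and a fuzzifier m = 2. The function names, the centroid-shift stopping rule and the tolerance are illustrative assumptions and are not taken from any of the cited implementations. The point to note is that the membership row of each pixel depends only on that pixel and the current centroids, so the loop over pixels is the part that could be assigned one GPU thread per pixel.

```python
def update_memberships(pixels, centroids, m=2.0, eps=1e-12):
    """Return u[i][j], the membership of pixel i in cluster j (standard FCM rule)."""
    power = 2.0 / (m - 1.0)
    memberships = []
    # Each pass of this loop is independent of the others: this is the
    # data-parallel part (conceptually, one GPU thread per pixel).
    for x in pixels:
        # eps avoids division by zero when a pixel coincides with a centroid.
        dists = [abs(x - c) + eps for c in centroids]
        row = [1.0 / sum((d_j / d_k) ** power for d_k in dists) for d_j in dists]
        memberships.append(row)
    return memberships


def update_centroids(pixels, memberships, m=2.0):
    """Return new centroids as membership-weighted means of the pixels."""
    n_clusters = len(memberships[0])
    centroids = []
    for j in range(n_clusters):
        weights = [row[j] ** m for row in memberships]
        weighted_sum = sum(w * x for w, x in zip(weights, pixels))
        centroids.append(weighted_sum / sum(weights))
    return centroids


def fcm(pixels, centroids, m=2.0, tol=1e-5, max_iter=100):
    """Alternate the two updates until the centroids stop moving (assumed stopping rule)."""
    for _ in range(max_iter):
        u = update_memberships(pixels, centroids, m)
        new_centroids = update_centroids(pixels, u, m)
        shift = max(abs(a - b) for a, b in zip(new_centroids, centroids))
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, update_memberships(pixels, centroids, m)


if __name__ == "__main__":
    # Toy "image": two well-separated groups of intensities.
    toy_pixels = [0.0, 1.0, 2.0, 100.0, 101.0, 102.0]
    final_centroids, final_u = fcm(toy_pixels, [0.0, 50.0])
    print(final_centroids)
```

In a hybrid design, the per-pixel membership loop is the natural candidate for offloading, while the cheap convergence check can remain on the CPU.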
In this work, we show how to improve the performance of
FCM as well as a variation of it called Type-2 Fuzzy C-Means
(T2FCM) using GPUs. Following the findings of [1], we devise
a hybrid CPU-GPU implementation and compare it with a sequential
CPU implementation on two medical images.
The structure of this paper is as follows. Section II
briefly discusses related work. Section III presents
our methodology, which covers the sequential as
well as the hybrid implementations, and Section IV discusses
the experiments we conducted and the results we obtained.
Finally, we conclude our work and provide some directions
for future research.
II. RELATED WORKS
Most research efforts have focused on improving the accuracy of
FCM, with fewer researchers focusing on how to improve its
performance. For instance, Rowińska et al. [16] implemented
FCM on a parallel architecture. They used CUDA
to convert the sequential code of FCM to a parallel one. The
test data consisted of color images of
different sizes. They moved two main functions of FCM
to the GPU: computing the membership matrix and
computing the new centroids ran on the GPU side, while
the objective function and the termination condition were
evaluated on the CPU side. Their experiments were conducted
on an Intel Core i3 machine with NVIDIA GeForce GTX
560 video card and Windows 7 64-bit operating system.
Their CUDA implementation was tested against two sequential
implementations (in C++ and MATLAB). Two types of ex-
periments were conducted with one-/two-dimensional feature
spaces. The GPU implementation was 7 times faster than the
2015 6th International Conference on Information and Communication Systems (ICICS)
978-1-4799-7348-4/15/$31.00 ©2015 IEEE