A NOVEL FRAMEWORK FOR N-D MULTIMODAL IMAGE SEGMENTATION USING GRAPH
CUTS
Asem M. Ali, Aly A. Farag
Computer Vision and Image Processing Laboratory (CVIP Lab)
University of Louisville, Louisville, KY 40292
{asem,farag}@cvip.uofl.edu, URL: www.cvip.uofl.edu
*
ABSTRACT
This work proposes a new maximum a posteriori (MAP)-based framework
for segmenting multimodal images. A joint Markov Gibbs Random Field
(MGRF) model is used to describe the image, with the main focus on
more accurate model identification. For a known number of classes in
the given image, the empirical distributions of the image signals are
closely approximated by linear combinations of Gaussians (LCG) with
positive and negative components. The Gibbs potential, which
describes the spatial interaction between neighboring pixels, is
estimated analytically. Finally, an energy function built from these
models is globally minimized using graph cuts. Experiments show that
the developed technique gives promising, accurate results compared to
other known algorithms.
Index Terms— MRF, Graph Cut, LCG
1. INTRODUCTION
Segmentation is a fundamental problem in image processing. This
paper addresses the problem of accurate unsupervised segmentation of
multimodal grayscale images. There are many simple techniques, such
as region growing or thresholding, for multimodal image segmentation.
Although these techniques are widely used because of their simplicity
and speed, they cannot achieve accurate segmentation. This is because
they depend only on the marginal probability distributions, and in
most cases the signal ranges of different objects overlap.
Energy-based segmentation approaches are more robust. Our proposed
framework uses graph cuts as a powerful optimization technique to
obtain the optimal segmentation. Shi and Malik [1] proposed the
normalized cut criterion for graph partitioning, which measures both
the total dissimilarity between the different image regions and the
total similarity within them. To compute the minimum cut, which
corresponds to the optimal segmentation, they used an eigenvalue
system. Boykov
and Jolly [2] proposed a framework that used s/t graph cuts to get a
globally optimal object extraction method for N-dimensional images.
They minimized a cost function that combines region and boundary
properties of segments as well as topological constraints. That work
illustrated the effectiveness of formulating the object segmentation
problem via graph cuts. Since Boykov and Jolly introduced their
graph cuts segmentation technique in their seminal paper [2], it has
become one of the leading approaches to interactive N-D image
segmentation, and many publications have extended this work in
different directions. For more details see [3] and references
therein. These works showed the power of graph cuts as a tool for
image segmentation, since it optimizes energy functions that can
integrate region, boundary, and shape information. Also, the graph
cuts technique offers a reliable and robust globally optimal object
segmentation method. Most of these works address interactive
segmentation. Although interactive segmentation imposes some
topological constraints reflecting certain high-level contextual
information about the object, it depends on the user input. The user
inputs have to be accurately positioned; otherwise the segmentation
results change.
∗ This research has been supported by US National Science Foundation
Grant IIS-0513974.
This paper proposes a maximum a posteriori (MAP)-based approach to
multimodal grayscale image segmentation. To model the low-level
information in the multimodal image, we closely approximate the
empirical distributions of the image signals by a Linear Combination
of Gaussians (LCG) with positive and negative components. For
accurate model identification, the image is described by a joint
Markov Gibbs Random Field (MGRF) model, and the spatial interaction
potential of this MGRF model is estimated analytically. Finally, we
use the LCG and MGRF models to formulate an energy function, which is
globally minimized using graph cuts.
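To make the idea of an LCG with positive and negative components
concrete, the following is a minimal Python sketch, not the
estimation procedure of this paper. It assumes that the model is a
weighted sum of positive Gaussian components minus a weighted sum of
negative ones, and that the weights satisfy (sum of positive weights)
minus (sum of negative weights) = 1 so that the result integrates to
one. The function names and all component parameters are hypothetical
and chosen only for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def lcg_density(x, positive, negative):
    """Evaluate an LCG: positive components minus negative components.

    Each component is a (weight, mean, std) tuple with weight > 0.
    """
    pos = sum(w * gaussian_pdf(x, m, s) for w, m, s in positive)
    neg = sum(w * gaussian_pdf(x, m, s) for w, m, s in negative)
    return pos - neg


# Hypothetical two-mode model (assumed values, not taken from the paper).
# Weights satisfy sum(positive) - sum(negative) = 1.05 - 0.05 = 1.
positive = [(0.60, 60.0, 20.0), (0.45, 180.0, 25.0)]
negative = [(0.05, 60.0, 10.0)]

# Riemann-sum check that the LCG integrates to about one and stays nonnegative.
grid = [-200.0 + 0.5 * k for k in range(1401)]  # -200 .. 500 in steps of 0.5
values = [lcg_density(x, positive, negative) for x in grid]
total = 0.5 * sum(values)
print(round(total, 6), min(values) >= 0.0)
```

In this toy model the negative component narrows the first mode, a
shape that a mixture with only positive weights cannot reproduce with
the same number of components.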
2. PROPOSED FRAMEWORK
The weighted undirected graph $G = \langle V, E \rangle$ consists of
a set of vertices $V$ and a set of edges $E$ connecting the vertices.
Each edge is assigned a nonnegative weight. The set of vertices $V$
comprises the set of image pixels $P$ and two special terminal
vertices, $s$ (source/object) and $t$ (sink/background). The set of
edges $E$ consists of two subsets: n-links, the edges connecting
neighboring pixels in the image, and t-links, the edges connecting
the pixels with the terminals. An s/t cut divides the set of image
pixels into two subsets, background and object; the minimum-cost cut
can be computed efficiently in low-order polynomial time. Consider a
neighborhood system $N$ of all unordered pairs $\{p, q\}$ of
neighboring pixels in $P$. Let $L = \{0, 1, \ldots, K\}$ be the set
of labels, which correspond to the image modes. A labelling is a
mapping from $P$ to $L$, and we denote a labelling by
$f = \{f_1, \ldots, f_p, \ldots, f_{|P|}\}$. In other words, the
label $f_p$, which is assigned to the pixel $p \in P$, classifies it
into one of the modes. Now the goal is to find the best labelling
$f$, i.e., the optimal segmentation, by minimizing the following
function
$$E(f) = \sum_{p \in P} D_p(f_p) + \sum_{\{p,q\} \in N} V(f_p, f_q), \qquad (1)$$
where $D_p(f_p)$ measures how much assigning the label $f_p$ to pixel
$p$ disagrees with the pixel intensity $I_p$. A good choice for
$D_p(f_p)$, which represents the regional properties of the segments,
is
$$D_p(f_p) = -\ln P(I_p \mid f_p). \qquad (2)$$
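The energy (1), the data term (2), and the s/t graph described above
can be sketched for the two-label case as follows. This is a minimal
Python illustration under stated assumptions, not the implementation
used in this paper: each class likelihood is a single Gaussian
standing in for the LCG model, the pairwise term is assumed to be a
Potts penalty with a hypothetical constant gamma, and the minimum cut
is found with a plain Edmonds-Karp max-flow rather than a specialized
graph cut solver. All names and parameter values are hypothetical.

```python
import math
from collections import deque


def gaussian_pdf(x, mean, std):
    """Gaussian density; a stand-in for the per-class model P(I_p | f_p)."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def data_term(intensity, label, class_params):
    """Eq. (2): D_p(f_p) = -ln P(I_p | f_p), with P assumed Gaussian here."""
    mean, std = class_params[label]
    return -math.log(gaussian_pdf(intensity, mean, std))


def energy(labels, intensities, neighbors, class_params, gamma):
    """Eq. (1) with an assumed Potts term V(f_p, f_q) = gamma * [f_p != f_q]."""
    data = sum(data_term(i, f, class_params) for i, f in zip(intensities, labels))
    smooth = sum(gamma for p, q in neighbors if labels[p] != labels[q])
    return data + smooth


def min_cut_labels(intensities, neighbors, class_params, gamma):
    """Minimize the two-label energy with an s/t minimum cut (Edmonds-Karp)."""
    n = len(intensities)
    s, t = n, n + 1                        # terminal vertices
    cap = [dict() for _ in range(n + 2)]   # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v].setdefault(u, 0.0)          # reverse residual edge

    for p, intensity in enumerate(intensities):
        d0 = data_term(intensity, 0, class_params)
        d1 = data_term(intensity, 1, class_params)
        base = min(d0, d1)                 # constant shift keeps capacities nonnegative
        add_edge(s, p, d0 - base)          # t-link that is cut when p gets label 0
        add_edge(p, t, d1 - base)          # t-link that is cut when p gets label 1
    for p, q in neighbors:                 # n-links, one arc per direction
        add_edge(p, q, gamma)
        add_edge(q, p, gamma)

    while True:
        parent = {s: None}                 # BFS tree over residual edges
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break                          # no augmenting path: the flow is maximal
        bottleneck, v = math.inf, t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
    # Pixels still reachable from s lie on the source (object, label 1) side.
    return [1 if p in parent else 0 for p in range(n)]


intensities = [20, 30, 200, 25, 210, 220]            # toy 1-D "image"
neighbors = [(p, p + 1) for p in range(len(intensities) - 1)]
class_params = {0: (25.0, 15.0), 1: (205.0, 15.0)}   # hypothetical (mean, std)
labels = min_cut_labels(intensities, neighbors, class_params, gamma=2.0)
print(labels, energy(labels, intensities, neighbors, class_params, gamma=2.0))
```

Subtracting the smaller of the two data costs at each pixel changes
the energy only by a constant, so the minimizing labelling is
unaffected while every t-link weight stays nonnegative.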
We estimate the empirical distribution $P(\cdot)$ of each class as shown in
Sec. 2.1. The second term is the pairwise interaction model which
729 978-1-4244-1764-3/08/$25.00 ©2008 IEEE ICIP 2008