A NOVEL FRAMEWORK FOR N-D MULTIMODAL IMAGE SEGMENTATION USING GRAPH CUTS

Asem M. Ali, Aly A. Farag
Computer Vision and Image Processing Laboratory (CVIP Lab)
University of Louisville, Louisville, KY 40292
{asem,farag}@cvip.uofl.edu, URL: www.cvip.uofl.edu *

ABSTRACT

This work proposes a new MAP-based framework for segmenting multimodal images. A joint MGRF model is used to describe the image, with the main focus on more accurate model identification. For a known number of classes in the given image, the empirical distributions of the image signals are precisely approximated by LCG distributions with positive and negative components. The Gibbs potential, which describes the spatial interaction between neighboring pixels, is estimated analytically. Finally, an energy function based on these models is formulated and globally minimized using graph cuts. Experiments show that the developed technique gives promising, accurate results compared to other known algorithms.

Index Terms: MRF, Graph Cut, LCG

1. INTRODUCTION

Segmentation is a fundamental problem in image processing. This paper addresses the problem of accurate unsupervised segmentation of multimodal grayscale images. There are many simple techniques for multimodal image segmentation, such as region growing or thresholding. Although these techniques are widely known for their simplicity and speed, they cannot achieve accurate segmentation, because they depend only on the marginal probability distributions and, in most cases, the signal ranges of different objects overlap. Energy-based segmentation approaches are more robust. Our proposed framework uses graph cuts as a powerful optimization technique to obtain the optimal segmentation.
Shi and Malik [1] proposed the normalized cut criterion for graph partitioning, which measures both the total dissimilarity between the different image regions and the total similarity within them. To compute the minimum cut, which corresponds to the optimum segmentation, they solved an eigenvalue system. Boykov and Jolly [2] proposed a framework that uses s/t graph cuts to obtain globally optimal object extraction for N-dimensional images. They minimized a cost function that combines region and boundary properties of segments as well as topological constraints. That work illustrated the effectiveness of formulating the object segmentation problem via graph cuts. Since Boykov and Jolly introduced their graph cuts segmentation technique in their seminal paper [2], it has become one of the leading approaches to interactive N-D image segmentation, and many publications have extended it in different directions; for more details, see [3] and the references therein. These works showed the power of graph cuts as a tool for image segmentation, since it optimizes energy functions that can integrate region, boundary, and shape information. The graph cuts technique also offers a reliable and robust globally optimal object segmentation method. Most of these works address interactive segmentation. Although interactive segmentation imposes topological constraints that reflect high-level contextual information about the object, it depends on user input, which has to be accurately positioned; otherwise, the segmentation results change.

* This research has been supported by US National Science Foundation Grant IIS-0513974.

This paper proposes a maximum a posteriori (MAP)-based approach for multimodal grayscale image segmentation.
To model the low-level information in the multimodal image, we precisely approximate the empirical distributions of the image signals by a Linear Combination of Gaussian (LCG) distributions with positive and negative components. For accurate model identification, the image is described by a joint Markov Gibbs Random Field (MGRF) model, and the spatial interaction potential of this MGRF model is estimated analytically. Finally, we use the LCG and MGRF models to formulate an energy function, which is globally minimized using graph cuts.

2. PROPOSED FRAMEWORK

The weighted undirected graph $G = \langle \mathcal{V}, \mathcal{E} \rangle$ consists of a set of vertices $\mathcal{V}$ and a set of edges $\mathcal{E}$ connecting the vertices. Each edge is assigned a nonnegative weight. The set of vertices $\mathcal{V}$ comprises the set of image pixels $\mathcal{P}$ and two special terminal vertices, $s$ (source/object) and $t$ (sink/background). The set of edges $\mathcal{E}$ consists of two subsets: n-links, the edges connecting neighboring pixels in the image, and t-links, the edges connecting the pixels with the terminals. A minimum s/t cut, which can be computed efficiently in low-order polynomial time, divides the set of image pixels into two subsets, background and object.

Consider a neighborhood system $\mathcal{N}$ of all unordered pairs $\{p, q\}$ of neighboring pixels in $\mathcal{P}$. Let $\mathcal{L} = \{0, 1, \ldots, K\}$ be the set of labels corresponding to the image modes. A labelling is a mapping from $\mathcal{P}$ to $\mathcal{L}$, and we denote it by $f = \{f_1, \ldots, f_p, \ldots, f_{|\mathcal{P}|}\}$. In other words, the label $f_p$ assigned to the pixel $p \in \mathcal{P}$ classifies it into one of the image modes. The goal is to find the best labelling $f$, i.e., the optimal segmentation, by minimizing the following function:

$$E(f) = \sum_{p \in \mathcal{P}} D_p(f_p) + \sum_{\{p,q\} \in \mathcal{N}} V(f_p, f_q), \tag{1}$$

where $D_p(f_p)$ measures how much assigning the label $f_p$ to pixel $p$ disagrees with the pixel intensity $I_p$. A good choice for $D_p(f_p)$, which represents the regional properties of the segments, is

$$D_p(f_p) = -\ln P(I_p \mid f_p). \tag{2}$$

We estimate the empirical distribution $P(\cdot)$ of each class as shown in Sec. 2.1.
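As a minimal sketch of Eqs. (1) and (2), the following Python code evaluates E(f) and minimizes it with an s/t minimum cut on a toy 1-D signal with two labels. Several pieces are assumptions made only for illustration and are not taken from the text: each class likelihood P(I_p | f_p) is a single Gaussian rather than an estimated LCG, the pairwise term V is a Potts penalty with a hypothetical constant gamma, the neighborhood system links adjacent samples only, and the maximum flow is found with a plain Edmonds-Karp search rather than any particular solver.

```python
import math
from collections import deque


def data_cost(intensity, mean, sigma):
    """D_p(f_p) = -ln P(I_p | f_p), Eq. (2), for a single Gaussian class model (assumption)."""
    return 0.5 * ((intensity - mean) / sigma) ** 2 + math.log(sigma * math.sqrt(2.0 * math.pi))


def energy(labels, signal, models, gamma):
    """E(f), Eq. (1), on a 1-D signal with a Potts pairwise term (assumption)."""
    data = sum(data_cost(signal[p], *models[labels[p]]) for p in range(len(signal)))
    smooth = sum(gamma for p in range(len(signal) - 1) if labels[p] != labels[p + 1])
    return data + smooth


def min_cut_labels(signal, models, gamma):
    """Two-label minimizer of E(f) via an s/t minimum cut (Edmonds-Karp max-flow)."""
    n = len(signal)
    s, t = n, n + 1  # terminal vertices: source and sink
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for p in range(n):
        # t-links: the edge that gets cut pays the data cost of the chosen label.
        cap[s][p] = data_cost(signal[p], *models[1])  # cut when p ends on the sink side (label 1)
        cap[p][t] = data_cost(signal[p], *models[0])  # cut when p stays on the source side (label 0)
        if p + 1 < n:
            # n-link between neighboring samples: cut when their labels differ.
            cap[p][p + 1] = cap[p + 1][p] = gamma
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            break  # no augmenting path left: the flow is maximal
        # Find the bottleneck capacity along the path, then push that much flow.
        flow = math.inf
        v = t
        while v != s:
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            cap[parent[v]][v] -= flow
            cap[v][parent[v]] += flow
            v = parent[v]
    # Pixels still reachable from s lie on the source side of the minimum cut.
    return [0 if parent[p] != -1 else 1 for p in range(n)]


# Toy usage: (mean, sigma) per class and gamma are hypothetical values.
signal = [20, 25, 120, 30, 200, 210, 100, 205]
models = [(30.0, 20.0), (200.0, 20.0)]
labels = min_cut_labels(signal, models, gamma=5.0)
print(labels, energy(labels, signal, models, 5.0))
```

With this construction the cost of any s/t cut equals E(f) for the labelling it induces, so the minimum cut yields a global minimizer in the two-label case. The t-link capacities must be nonnegative, which holds here because sigma is large enough that the Gaussian density stays below 1.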
978-1-4244-1764-3/08/$25.00 ©2008 IEEE. ICIP 2008.

The second term is the pairwise interaction model which