Precise Image Segmentation by Iterative EM-Based Approximation of Empirical Grey Level Distributions with Linear Combinations of Gaussians

Aly A. Farag, Ayman El-Baz
CVIP Laboratory, Dept. of Electrical and Computer Eng., University of Louisville, Louisville, KY 40292

Georgy Gimel’farb
CITR, Tamaki Campus, Dept. of Computer Science, University of Auckland, Auckland, New Zealand

Abstract

A new algorithm for segmenting a multi-modal grey-scale image is proposed. The image is described as a sample of a joint Gibbs random field of region labels and grey values. To initialize the model, a multi-modal mixed empirical grey level density distribution is approximated with several linear combinations of Gaussians, one linear combination per region. Bayesian decisions involving Expectation-Maximization and genetic optimization techniques are used to sequentially estimate and refine parameters of the model, including the number of Gaussians for each region. The final estimates are more accurate than those obtained with conventional normal mixture models and yield more adequate region borders in the image. Experiments with simulated and real medical CT images confirm the accuracy of our approach.

1. Introduction

A large number of image segmentation methods based on estimating marginal probability densities of signals and separating their dominant modes have been developed and tested over the last three decades (see [1–5] to cite a few). However, many important applications such as medical image analysis or industrial vision still encounter difficulties in separating practically meaningful continuous or disjoint objects, even when signal densities are distinct to the point where their mixture becomes strongly multimodal. The basic issue is the accuracy of region borders, which usually are essential for correct interpretation of the objects. The borders are formed by intersecting tails of the signal densities for the objects.
Therefore, it is the tails that have to be precisely estimated in order to separate, e.g., a darker object from a brighter background. One of the practical problems that inspired our approach is to accurately detect the lung region in a spiral CT chest slice such that the detected lung borders closely match those outlined by a radiologist.

Because there always exists an overlap between the signal ranges for the different objects, precise segmentation has to account for spatial distributions of the signals, too. Markov-Gibbs random field models show considerable promise in spatial image analysis [6–13]. Thus we consider images to be segmented as samples of a two-level Gibbs random field of more or less continuous regions (the higher level) and grey values in each region (the lower level) [7, 9–11].

We choose for each level the simplest probability models ensuring the necessary precision of the segmentation. At the lower level, the signals are described by a conditionally independent random field of grey values having different probability densities for the regions. In practice, the marginal probability distributions of grey levels in each region are quite intricate. Hence we represent each density as a linear combination of Gaussians with positive and negative components [14–16], which is more accurate than a conventional normal mixture with only positive components. Then a mixed empirical density distribution over the image is approximated with a mixture of several linear combinations of Gaussians. The lower-level segmentation on the basis of the estimated densities is refined further at the higher level using the Bayesian maximum a posteriori (MAP) decision about the segmentation map for the joint Gibbs model of the maps and images.
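
To make the idea of a linear combination of Gaussians with positive and negative components concrete, the following minimal Python sketch evaluates such a density on the grey levels 0..255. It is an illustration under stated assumptions and not the estimation procedure of this paper: the names `gaussian` and `lcg_density` and all parameter values are hypothetical. The one constraint it relies on is elementary: because every Gaussian integrates to one, the combination integrates to one only if the positive weights minus the negative weights sum to one.

```python
import math


def gaussian(q, mean, var):
    """Normal density with the given mean and variance, evaluated at grey level q."""
    return math.exp(-0.5 * (q - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def lcg_density(q, positive, negative):
    """Linear combination of Gaussians evaluated at grey level q.

    positive, negative: lists of (weight, mean, variance) tuples with
    non-negative weights; the negative components are subtracted.
    """
    plus = sum(w * gaussian(q, m, v) for w, m, v in positive)
    minus = sum(w * gaussian(q, m, v) for w, m, v in negative)
    return plus - minus


# Hypothetical parameters for one region (illustrative values only).
# The weights satisfy 1.2 - 0.2 = 1, so the combination integrates to one.
positive = [(1.2, 60.0, 15.0 ** 2)]
negative = [(0.2, 60.0, 5.0 ** 2)]

# Evaluate the density on the discrete grey levels 0..255.
density = [lcg_density(q, positive, negative) for q in range(256)]
print("sum over grey levels:", sum(density))
print("minimum value:", min(density))
```

With these illustrative values the narrow negative component flattens the peak of the wide positive one, a shape that a mixture with only positive components cannot produce from a single Gaussian. A negative component can in general drive the combination below zero, so any chosen parameter set should be checked for non-negativity, as the last line does.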
We propose a new sequential EM-based algorithm to estimate the parameters of each linear combination of Gaussians, including the number of components, so that it closely approximates a given multi-modal empirical probability distribution of signals. For simplicity, we restrict our consideration below to a bimodal signal density describing a dark object and its bright background, but the extension of the proposed segmentation scheme to the multi-modal case is straightforward. In the bimodal case, the empirical mixed density is first split into two dominant positive Gaussians and a set of secondary alternating Gaussians. The latter describe deviations of the empirical distribution from the dominant components. Both the dominant and secondary terms are sequentially estimated using the EM algorithm [17–19, 21, 22] or, more specifically, its early variant for normal mixtures [20] (see also [22]). To initially segment the image,
0-7695-2158-4/04 $20.00 (C) 2004 IEEE