Evolutionary Maximum Likelihood Image Compression

Mohamed M. Tawfick
Mentor Graphics Egypt
78 Elnozha St., Heliopolis
Cairo 11361, Egypt
mohamed_moharam@mentor.com

Hazem M. Abbas
Mentor Graphics Egypt
78 Elnozha St., Heliopolis
Cairo 11361, Egypt
hazem_abbas@mentor.com

Hussein I. Shahein
Ain Shams University
Dept. Computer & Systems Eng.
1 ElSarayat St., Abbassia
Cairo 11571, Egypt

ABSTRACT

This work outlines an evolutionary algorithm for image vector quantization. An integer-coded genetic algorithm (GA) that employs the maximum likelihood (ML) measure as the fitness function is introduced. The proposed algorithm allows for different chromosome representations and provides an adaptation to the genetic operators to suit the image quantization problem. The main objective of the algorithm is, for a codebook with a pre-defined size, to find the best set of image blocks that make up the codewords. Each codeword will be representative of a group of blocks. The final codebook is formed from the set of groups' averages. Simulation results show the effectiveness of the algorithm especially when compared with the famous LBG vector quantizer.

Categories and Subject Descriptors: I.2.m [Computing Methodologies]: Artificial Intelligence - Miscellaneous - Evolutionary computing and genetic algorithms

General Terms: Algorithms, Experimentation, Performance

Keywords: Evolutionary Algorithms, Image Compression, Clustering, Maximum Likelihood

1. INTRODUCTION

Vector quantization image coding is used to reduce the data storage or transmission bit-rate while maintaining an acceptable image quality. The main component of vector quantization is the codebook which is a group of code vectors (codewords). Compression is then achieved by storing/transmitting the codeword index instead of the codeword itself. The performance of vector quantization is highly dependent on the used codebook.
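
As a small illustration of the index-based coding described above, the following sketch encodes each image block as the index of its nearest codeword and decodes by table lookup. The function names, the toy 4-pixel blocks, and the use of squared Euclidean distance as the distortion measure are assumptions made for this example; they are not taken from the paper.

```python
def squared_distance(a, b):
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(blocks, codebook):
    """Replace every block by the index of its nearest codeword."""
    return [
        min(range(len(codebook)),
            key=lambda i: squared_distance(block, codebook[i]))
        for block in blocks
    ]


def decode(indices, codebook):
    """Rebuild an approximation of the blocks from the stored indices."""
    return [codebook[i] for i in indices]


# Hypothetical toy data: three codewords and three flattened 2x2 blocks.
codebook = [(0, 0, 0, 0), (128, 128, 128, 128), (255, 255, 255, 255)]
blocks = [(3, 1, 0, 2), (250, 255, 251, 254), (120, 130, 125, 131)]

indices = encode(blocks, codebook)         # [0, 2, 1]
reconstructed = decode(indices, codebook)  # lossy approximation of blocks
```

Only the indices need to be stored or transmitted, provided the decoder holds the same codebook, which is why the quality of the reconstruction depends so strongly on the codebook.
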
An optimal codebook is dependent on the nature of the image to be compressed, its dimensions, and the required rate (codebook size) to be used. The problem addressed here is that of designing a global codebook for several sets of training images, with the intention that, for each set of images, the algorithm will run only once for each desired codebook size. A new algorithm for designing a global codebook is described that is based on an integer-coded Genetic Algorithm (GA) with a maximum likelihood (ML) measure as the fitness function and problem-dependent genetic operators.

Copyright is held by the author/owner(s). GECCO'09, July 8–12, 2009, Montréal Québec, Canada. ACM 978-1-60558-325-9/09/07.

2. PROPOSED GA-ML VQ ALGORITHM

The LBG algorithm [1] (also known as the Generalized Lloyd Algorithm) is by far the most popular vector quantizer. It is based on performing a finite sequence of steps in which, at every step, a new quantizer is produced with a total distortion less than or equal to that of the previous one. The basic disadvantages of the method are the complexity of its iterative optimization procedure and its sensitivity to the initial codebook; as a result, it cannot guarantee globally optimal solutions. There have been several attempts to introduce evolutionary VQ or clustering algorithms [2, 3, 4, 5, 6].

This work presents another evolutionary VQ algorithm that adopts an integer-coded chromosome representation and uses a maximum likelihood (ML) fitness function. It implements problem-specific crossover and mutation operators that provide good exploration of the search space. Some modifications are made to the conventional ML function, and these proved to produce better codebooks. The basic objective of the proposed VQ algorithm is to find the best set of blocks that would represent the whole set of images with the best achievable quality for the given codebook size, W.
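
For reference, the LBG baseline mentioned above can be sketched as follows. This is a minimal illustration of the generalized Lloyd iteration, not the authors' implementation: the function names, the random initialization from the training blocks, the squared Euclidean distortion, and the handling of empty cells are all assumptions made for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(block, codebook):
    """Index of the codeword closest to the given block."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(block, codebook[i]))


def total_distortion(blocks, codebook):
    """Sum of distances from every block to its nearest codeword."""
    return sum(squared_distance(b, codebook[nearest(b, codebook)])
               for b in blocks)


def lbg(blocks, codebook_size, max_iters=50, seed=0):
    """Generalized Lloyd iteration; returns the codebook and the
    total distortion recorded after each step."""
    rng = random.Random(seed)
    # Assumed initialization: distinct training blocks picked at random.
    codebook = [list(b) for b in rng.sample(blocks, codebook_size)]
    history = [total_distortion(blocks, codebook)]
    for _ in range(max_iters):
        # Step 1: assign every block to its nearest codeword.
        groups = [[] for _ in codebook]
        for b in blocks:
            groups[nearest(b, codebook)].append(b)
        # Step 2: move each codeword to the average of its group.
        for i, group in enumerate(groups):
            if group:  # assumption: an empty cell keeps its old codeword
                codebook[i] = [sum(col) / len(group) for col in zip(*group)]
        history.append(total_distortion(blocks, codebook))
        if history[-1] == history[-2]:  # no further improvement
            break
    return codebook, history


# Hypothetical toy data: two well-separated groups of 2-D vectors.
blocks = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
codebook, history = lbg(blocks, codebook_size=2)
```

The recorded distortions never increase from one step to the next, which matches the property stated above. The result still depends on the initial codebook (here, on `seed`), which is the sensitivity that motivates a global search such as a GA.
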
Here, the codebook size is pre-defined, and the codewords are chosen from the training set of images. The algorithm has two outcomes, namely, the best codebook found and the groups identified within the input images. The final codebook is then formed from the set of groups' averages.

In the VQ problem, each chromosome represents the set of codewords that forms a codebook. A chromosome is composed of a string of W integers, each representing a vector's index in the input blocks. Order does not matter, and repetition is not allowed. Selecting the codewords from the image makes use of the locality of reference, in addition to limiting the search space dramatically. It also allows using an integer-coded representation instead of the more commonly used real-number representation, which increases the computational complexity of the process.

The recombination process is carried out using the elite selection strategy, employing a ranking selection function that is based on the normalized geometric distribution [7]. A variation of the single-point crossover operator is implemented here such that the order of codewords is irrelevant, exchanging common genetic material between parents is of no value, children have to inherit their genetic material from both parents, and randomization makes it possible to produce different offspring from the same parents if the process is repeated, thus providing better exploration of the search space. The mutation operation randomly selects a number of genes in the parent and mutates their values to new ones