Prevalent Color Extraction and Indexing

K. K. Thyagharajan #1 and R. I. Minu *2
# Dean (Academic), RMD Engineering College, India
1 kkthyagharajan@yahoo.com
* Research Scholar, Dept. of Computer Science, Anna University, India
2 r_i_minu@yahoo.co.in

Abstract:- The colors in an image provide a tremendous amount of information. Using this color information, images can be segmented, analyzed, labeled and indexed. In content-based image retrieval systems, color is one of the basic primitive features used. In Prevalent Color Extraction and Indexing, the most extensive color in an image is identified and used for indexing. For the implementation, an image dataset of the Asteroideae flower family is used. The family consists of more than 16,000 species, of which nearly 100 are considered and indexed by their dominant colors. To extract the most prominent color from user-defined images, the overall color of an image has to be quantized. Spatially quantizing the colors of an image in order to extract the prevalent color is the major objective of this paper. A combination of the K-means and Expectation Maximization (EM) clustering algorithms, called the hidden-value learned K-mean clustering quantization algorithm, is used to avoid the over-clustering behavior of the K-means algorithm. The experimental results show marginal differences between these algorithms.

Keyword: Color Quantization, K-Mean, EM Algorithm, Asteroideae, RGB, HSV

I. INTRODUCTION

An image can be represented either through global features or through local features. Most image retrieval techniques, such as CBIR [1][3], use local features, which are regarded as the content of an image. These local features are the color, shape and texture of the image, which are used to understand and identify it. In content-based image retrieval systems, color [1][2] is one of the essential low-level features used to index an image. Color is a parameter that depends on the frequency of light [12].
In digital image processing, colors are represented as mathematical coordinates in what are called color models [14][20]. The commonly used color models are RGB, HSV, HSI, CMY and YCbCr. Each model has its own characteristics; in this work on prevalent color extraction, the RGB and HSV models are used.

Dominant color extraction is a widely studied research area. B. S. Manjunath et al. [4], [5] provide an effective way of determining a maximum of 8 dominant colors from an image, which is used as the Dominant Color Descriptor (DCD) of MPEG-7 [5]. In DCD, the image is segmented into sub-regions, the colors in these regions are quantized, and color histograms are generated. From the color bins of all the sub-regions, the dominant colors are identified and labeled uniquely. For the labeled colors, the percentage, the color variance and the spatial coherency are determined, and the similarity between pixel colors is measured using the Euclidean distance. Image retrieval systems have been designed using DCD as one of the primitive low-level features [6][8][9][23].

In this work, the most prevalent color is extracted from an image as follows. First, the image is quantized using the hidden-value learned K-mean clustering algorithm (EMK). The quantized image is then converted into the HSV color model, the predominant color is extracted from it, and the image is indexed according to its extracted color. These steps are explained, with the corresponding result analysis, in the following sections.

II. COLOR IMAGE QUANTIZATION

The input images are raw data and have to be pre-processed before any mathematical operation is performed on them. To identify the most dominant color in an image, the colors of the image have to be quantized to a limited set of colors. The colors in an image can be quantized either by scalar quantization methods or by vector quantization methods [13].
There are many color quantization techniques [18]; some of the standard techniques used for statistical analysis are illustrated below.

A. Scalar Quantization

In the scalar (uniform) quantization method, the RGB color values are simply sliced into a fixed number of ranges, typically 64, so that every RGB pixel value is quantized to one of 64 levels instead of the original range [0-255]. The spatial color distribution is not considered in uniform quantization, so the method loses most of the essential information about the colors.

K. K. Thyagharajan et al. / International Journal of Engineering and Technology (IJET), ISSN: 0975-4024, Vol 5 No 6, Dec 2013-Jan 2014, 4841
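
The uniform quantization of Section II.A can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "64" means 64 levels per channel (a bin width of 4 intensity values), and the names `LEVELS`, `quantize_channel`, `quantize_pixel` and `most_frequent_color` are hypothetical, introduced only for illustration.

```python
from collections import Counter

LEVELS = 64                 # assumed number of levels per channel
BIN_WIDTH = 256 // LEVELS   # 4 intensity values fall into each level


def quantize_channel(value: int) -> int:
    """Map an 8-bit channel value (0-255) to a level index (0-63)."""
    if not 0 <= value <= 255:
        raise ValueError(f"channel value out of range: {value}")
    return value // BIN_WIDTH


def quantize_pixel(rgb):
    """Quantize each of the R, G and B channels independently."""
    r, g, b = rgb
    return (quantize_channel(r), quantize_channel(g), quantize_channel(b))


def most_frequent_color(pixels):
    """Return the most frequent quantized color and its pixel count."""
    counts = Counter(quantize_pixel(p) for p in pixels)
    return counts.most_common(1)[0]


if __name__ == "__main__":
    # Three slightly different reds collapse into one quantized color.
    pixels = [(255, 0, 0), (254, 1, 2), (253, 3, 1), (0, 0, 255), (10, 200, 30)]
    print(most_frequent_color(pixels))   # ((63, 0, 0), 3)
```

Because the bin edges are fixed in advance, they ignore how the colors are actually distributed in the image: the neighboring intensities 127 and 128 fall into different levels (31 and 32), while 128 and 131 share one. This is the loss of color information that the text attributes to uniform quantization.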