The Randomized Approximating Graph Algorithm for Image Annotation Refinement Problem

Yohan Jin
University of Texas Dallas, Multimedia Systems Lab.
Richardson, Texas 75083-0688, USA
yohan@utdallas.edu

Kibum Jin
Soongsil University, Computer Institute
Dongjak Gu, Sangdo-Dong 511, Seoul, Korea
ckb@ssuci.ac.kr

Latifur Khan
University of Texas Dallas, Data Mining Lab.
Richardson, Texas 75083-0688, USA
lkhan@utdallas.edu

B. Prabhakaran
University of Texas Dallas, Multimedia Systems Lab.
Richardson, Texas 75083-0688, USA
praba@utdallas.edu

Abstract

Images on the Web and on personal computers are now a pervasive part of everyday life. Many AIA (Automatic Image Annotation) algorithms have been proposed to retrieve these images effectively. However, AIA still suffers from low accuracy because it cannot overcome the semantic gap between low-level features ('color', 'texture' and 'shape') and high-level semantic meanings (e.g., 'sky', 'beach'). As a result, AIA techniques annotate images with many noisy keywords. Refinement processes have appeared recently; they try to remove noisy keywords by using a knowledge base and boosting candidate keywords. Because the set of candidate keywords is unbounded and the textual descriptions of web images are often incorrect, a deterministic polynomial-time algorithm is now needed. We show that finding an optimal solution for removing noisy keywords in the graph is an NP-complete problem, and we propose a new methodology for KBIAR (Knowledge-Based Image Annotation Refinement) that uses the randomized approximation graph algorithm as the general deterministic polynomial-time algorithm.

1. Introduction

With the development of digital media and web technologies, a great number of content-based image retrieval (CBIR) approaches have appeared in the last few years, such as the Co-Occurrence Model [7], the Translation Model [12], CRM (Cross-Media Relevance Model) [14], and so on.
However, for visual similarity, CBIR relies on low-level features (color histograms, textures, shapes, and so on), which leaves a semantic gap between low-level visual features and the semantic meaning of images. Because of this limitation, CBIR research is still far from a reasonable accuracy level for commercial use: many noisy keywords are annotated along with the correct ones. Humans, in fact, understand images based on each person's knowledge beyond the image itself. To improve image annotation performance by imitating the way humans understand images, Yohan et al. [1] proposed the first approach to Knowledge-Based Image Annotation Refinement (KBIAR). It refined image annotation results by removing noisy keywords from among the annotated keywords of each image (the so-called "candidate keywords"), and it proposed semantic distances between candidate keywords for identifying irrelevant ones. WordNet, a mirror of world knowledge, was used to obtain the semantic distances between candidate keywords.

Inspired by the idea of Yohan et al., several approaches have appeared that refine automatic image annotation by using semantic knowledge about the relationships between annotated ("candidate") keywords, the so-called KBIAR (Knowledge-Based Image Annotation Refinement) approaches: [3] proposed an adaptive graphical model for the refining process that fuses visual content features and keyword correlation, and [2] performed image annotation refinement by re-ranking the annotations using the Random Walk with Restarts algorithm. [4] presented an approach for finding an optimal subset of an image's annotation keywords by using a greedy heuristic solution. There are approaches [8][10] which apply refining methodology into
978-1-4244-2340-8/08/$25.00 ©2008 IEEE