A Parallel Implementation of Content-Based Image Retrieval: Final Project Report

2008/12/18

Chunsheng Fang, University of Cincinnati, fangcg@email.uc.edu
Ryan Anderson, University of Cincinnati, andersr9@email.uc.edu

Abstract

Content-based image retrieval (CBIR) has many applications but remains a computationally intensive task, mainly because practical use requires a large image database. Our project examines existing CBIR implementations and improves upon them using a parallel computing approach. Both the offline feature extraction and the online query process are parallelized and implemented on the Beowulf cluster at the University of Cincinnati. Optimization and evaluation are also performed.

1. Introduction

The area of interest we have chosen is content-based image retrieval (CBIR) [2]. An online demo of a CBIR image search engine was developed by the authors at the University of Cincinnati in 2008 [4].

CBIR compares images for similarity based on their content. Image content is compared by means of a feature vector extracted from each image; the feature vector is computed according to the type of content to be compared (colors, textures, etc.). Typically, a query image is matched against a training set or database of precomputed feature vectors: the feature vector of the query image is computed and compared against the database, and the top K stored images closest to the query image are returned to the user as the best matches.

It is worth noting that Google's image search is text-based, which requires manually labeled images for retrieval. CBIR can reduce the semantic gap between what the user is looking for and what the system retrieves.

Figure 0: VC-bir image search engine at UC

1.2 Related work

CBIR is computationally intensive for several reasons, including the huge image database, offline feature extraction, and online image retrieval.
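To make the cost of the online step concrete, the following is a minimal sketch of the query process described in the introduction: features are precomputed offline, the query feature is extracted online, and the top K closest stored images are returned. It is not the authors' implementation. The intensity-histogram feature, the Euclidean distance, the function names, and the file names are all illustrative assumptions.

```python
import heapq
import math


def intensity_histogram(pixels, bins=4):
    """Toy feature extractor (assumption): a normalized intensity histogram.

    `pixels` is a flat list of intensities in 0..255. It stands in for
    whatever color or texture feature a real CBIR system would use.
    """
    hist = [0.0] * bins
    for p in pixels:
        hist[min(p * bins // 256, bins - 1)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, database, k):
    """Return the k (distance, image_id) pairs closest to query_vec.

    `database` maps image_id -> precomputed feature vector.
    Every stored vector is visited once, so the scan is linear
    in the size of the database.
    """
    scored = (
        (euclidean(query_vec, vec), image_id)
        for image_id, vec in database.items()
    )
    return heapq.nsmallest(k, scored)


# Offline step: precompute features for a tiny made-up database.
database = {
    "dark.png": intensity_histogram([10, 20, 30, 40]),
    "mid.png": intensity_histogram([100, 120, 130, 150]),
    "bright.png": intensity_histogram([220, 230, 240, 250]),
}

# Online step: extract the query feature and rank the stored images.
query = intensity_histogram([15, 25, 35, 45])
results = top_k(query, database, k=2)
```

Each distance computation in the scan is independent of the others, and so is each offline feature extraction. This independence is what makes both steps natural candidates for parallelization.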
We therefore hope to speed up both training and querying by parallelizing them. Our literature search found a paper by Lu and Yu (2007) [1], which discusses an approach very similar to the one we intend to take. In the