Image Hub Explorer: Evaluating Representations and Metrics for Content-based Image Retrieval and Object Recognition

Nenad Tomašev and Dunja Mladenić

Institute Jožef Stefan
Artificial Intelligence Laboratory
Jamova 39, 1000 Ljubljana, Slovenia
nenad.tomasev@ijs.si, dunja.mladenic@ijs.si

Abstract. Large quantities of image data are generated daily and visualizing large image datasets is an important task. We present a novel tool for image data visualization and analysis, Image Hub Explorer. The integrated analytic functionality is centered around dealing with the recently described phenomenon of hubness and evaluating its impact on the image retrieval, recognition and recommendation process. Hubness is reflected in that some images (hubs) end up being very frequently retrieved in 'top k' result sets, regardless of their labels and target semantics. Image Hub Explorer offers many methods that help in visualizing the influence of major image hubs, as well as state-of-the-art metric learning and hubness-aware classification methods that help in reducing the overall impact of extremely frequent neighbor points. The system also helps in visualizing both beneficial and detrimental visual words in individual images. Search functionality is supported, along with the recently developed hubness-aware result set re-ranking procedure.

Keywords: image retrieval, object recognition, visualization, k-nearest neighbors, metric learning, re-ranking, hubs, high-dimensional data

1 Introduction

Image Hub Explorer is the first image collection visualization tool aimed at understanding the underlying hubness [1] of the k-nearest neighbor data structure. Hubness is a recently described aspect of the well known curse of dimensionality that arises in various sorts of intrinsically high-dimensional data types, such as text [1], images [2] and audio [3]. Its implications were most thoroughly examined in the context of music retrieval and recommendation [4].
Comparatively little attention has been given to emerging hubs and the skewed distribution of influence in image data. One of the main goals of Image Hub Explorer is to enable other researchers and practitioners to easily detect hubs in their datasets, as well as to test and apply the built-in state-of-the-art hubness-aware data mining methods.
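
To make the notion of a hub concrete, the following minimal sketch counts how often each point occurs in the k-nearest neighbor lists of the other points and reports the skewness of those counts, which is one common way to quantify hubness. It is an illustration under stated assumptions, not the implementation used in Image Hub Explorer: the function names `k_occurrences` and `skewness` are hypothetical, the Euclidean distance is assumed, and random Gaussian vectors stand in for image feature representations.

```python
import math
import random


def k_occurrences(points, k):
    """Count how often each point appears in the k-NN lists of the other points."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Euclidean distance from point i to every other point (an assumption;
        # any metric could be substituted here).
        dists = [(math.dist(points[i], points[j]), j) for j in range(n) if j != i]
        dists.sort()
        for _, j in dists[:k]:
            counts[j] += 1
    return counts


def skewness(values):
    """Third standardized moment; a large positive value suggests a few dominant hubs."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return 0.0
    return sum((v - mean) ** 3 for v in values) / n / var ** 1.5


if __name__ == "__main__":
    # Synthetic stand-in for image features: 200 points in 50 dimensions.
    rng = random.Random(0)
    data = [[rng.gauss(0, 1) for _ in range(50)] for _ in range(200)]
    counts = k_occurrences(data, k=5)
    print("max k-occurrence:", max(counts))
    print("skewness of k-occurrences:", skewness(counts))
```

Points whose count is far above k are candidate hubs, while points with a count of zero never appear in any result set. The brute-force neighbor search is quadratic in the number of points and is meant only for small illustrative datasets.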