Typicality Ranking of Images Using the Aspect Model

Taro Tezuka and Akira Maeda

College of Information Science and Engineering
Ritsumeikan University
{tezuka, amaeda}@media.ritsumei.ac.jp

Abstract. Searching the World Wide Web for images in order to learn what an object looks like is a very common task. The best response to such a task is to present the most typical image of the object. Existing web-based image search engines, however, return many results that are not typical. In this paper, we propose a method for obtaining typical images by estimating the parameters of a generative model. Specifically, we assume that typicality is represented by combinations of symbolic features, and express it using the aspect model, which is a generative model with discrete latent and observable variables. The symbolic features used in our implementation are the presence of specific colors in the object region of the image. The estimated latent variables are filtered, and the one that best expresses typicality is selected. Based on the proposed method, we implemented a system that ranks images in order of typicality. Experiments showed the effectiveness of our method.

Keywords: Image retrieval, Typicality, Bag-of-features, Generative model

1 Introduction

One important use of web-based image search is to learn the visual characteristics of an object. In such a case, what the user wants is the most "typical" look of the object. In existing web image search engines, however, the highly ranked search results contain images that are not typical. The goal of this paper is to propose and evaluate a method that extracts typical images from the results of a web image search by applying a generative model, a type of probabilistic model.
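For orientation, the aspect model is commonly written in the following standard form. This is a sketch for reference only: the symbols d (an image), w (a symbolic feature, such as the presence of a color), and z (a latent aspect) are assumed here for illustration and may differ from the notation used in the rest of the paper.

```latex
% Aspect model, standard form. Symbols are illustrative assumptions:
%   d = an image, w = a symbolic feature, z = a latent aspect
\[
  P(d, w) = \sum_{z \in Z} P(z)\, P(d \mid z)\, P(w \mid z)
\]
```

In this form, an image and a feature are conditionally independent given the aspect, and all three variables are discrete. The parameters are commonly estimated with the EM algorithm.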
Although typicality is a difficult concept to capture, in this paper we define it as follows:

Definition: An image I is a typical image for query Q if the word Q is an appropriate label for I, given that the evaluator has enough knowledge of the object referred to by Q.

Our proposed method estimates the "aspects" expressed in a set of images and selects the aspect assumed to express typicality. We then rank the images using conditional probability.

One of the characteristics of our method is that it expresses typicality using discrete probabilistic variables. Many models for classification and dimensionality reduction, including k-means and PCA (principal component analysis), use continuous variables. By contrast, our model consists of discrete variables only. In this sense, it is an intrinsically symbolic approach.

The method can be used to obtain a large set of images with labels. The set