Procedimientos de realimentación basados en métodos estadísticos para la recuperación por contenido en bases de datos de imágenes (Feedback procedures based on statistical methods for content-based retrieval in image databases)
TIN2006-10134 *

G. Ayala, X. Benavent, E. de Ves, J. Domingo, I. Epifanio, M.V. Ibáñez, T. León, A. Simó

February 4, 2009

Abstract

The project is mainly concerned with content-based image retrieval. Sections 1, 2, 3 and 4 describe the results achieved. The members of the group have also published other papers that are not directly related to the project but were supported by it. The references generated by this project are [1, 8, 10, 9, 17, 13, 14, 18, 20, 15, 16, 4, 2, 12, 19, 3, 5, 7, 6].

1 Applying logistic regression to relevance feedback in image retrieval systems

This project deals with the problem of image retrieval from large image databases, and is mainly concerned with relevance feedback. Relevance feedback designates the actions performed by a user to interactively improve the results of a query by reformulating it. An initial query formulated by a user may not fully capture his/her wishes, for several reasons: the complexity of formulating the query, lack of familiarity with the data collection procedures, or inadequacy of the available features. Users then typically change the query manually and re-execute the search until they are satisfied. By using relevance feedback, the system learns a new query that better captures the user's need for information.

A particularly interesting problem is the retrieval of all images that are similar to one in the user's mind, taking into account his/her feedback, which is expressed as positive or negative preferences for the images that the system progressively shows during the search. In [20] we presented a novel algorithm for incorporating user preferences into an image retrieval system based exclusively on the visual content of the image, which is stored as a vector of low-level features.
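As a rough illustration of this kind of feedback loop, the sketch below fits a linear model for the logit of the relevance probability, logit(p) = log(p / (1 - p)) = b + w . x, from a handful of images the user has marked, and then ranks the whole database by the fitted score. It is not the implementation from [20]: the function names, the plain gradient-descent fit, the small L2 penalty and the toy feature vectors are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear_score(x, w):
    """Linear predictor w . x + b, where b is stored as the last entry of w."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]


def relevance_probability(x, w):
    """Estimated probability that the image with features x is relevant."""
    return sigmoid(linear_score(x, w))


def fit_logit(features, labels, lr=0.5, epochs=500, l2=1e-3):
    """Fit the logistic model by batch gradient descent (an assumed fitting method).

    features: feature vectors of the images the user has marked
    labels:   1 for a positive sample, 0 for a negative one
    l2:       small ridge penalty (an assumption) that keeps the weights finite
              when the few marked samples are perfectly separable
    """
    dim = len(features[0])
    n = len(features)
    w = [0.0] * (dim + 1)  # dim feature weights followed by the intercept
    for _ in range(epochs):
        grad = [0.0] * (dim + 1)
        for x, y in zip(features, labels):
            err = relevance_probability(x, w) - y
            for j in range(dim):
                grad[j] += err * x[j]
            grad[-1] += err
        for j in range(dim):
            w[j] -= lr * (grad[j] / n + l2 * w[j])
        w[-1] -= lr * grad[-1] / n  # the intercept is not penalised
    return w


def rank_database(database, w):
    """Indices of all images, most probably relevant first."""
    return sorted(range(len(database)),
                  key=lambda i: linear_score(database[i], w),
                  reverse=True)


# Toy database: six images, two made-up low-level features each.
database = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.4],
            [0.3, 0.6], [0.2, 0.9], [0.1, 0.8]]
positives, negatives = [0, 1], [4, 5]  # one round of hypothetical user feedback

marked = [database[i] for i in positives + negatives]
labels = [1] * len(positives) + [0] * len(negatives)
weights = fit_logit(marked, labels)
print(rank_database(database, weights))
```

In an interactive system the last four statements would run once per feedback round, with the newly marked images added to `positives` and `negatives` before the model is refitted. Because the logistic function is monotone, ranking by the linear score gives the same order as ranking by the estimated probability.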
The algorithm considers the probability that an image belongs to the set of those sought by the user, and models the logit of this probability as the output of a linear model whose inputs are the low-level image features. The image database is ranked by the output of the model and shown to the user, who selects a few positive and negative samples; the process is repeated iteratively until he/she is satisfied.

* Ministerio de Educación y Ciencia. Dirección General de Investigación. Programa Nacional de Tecnologías de la Información y las Comunicaciones. 1/10/2006-30/9/2009

The