Model-Based Detection of Salient Image Objects for Image Retrieval and Mining

Roman M. Palenichka, Rokia Missaoui and Jean Vaillancourt
Dept. of Computer Science and Engineering
Université du Québec en Outaouais, Gatineau, Québec, Canada

Abstract

It is proposed to describe image content concisely by an ordered set of salient image objects for an effective implementation of image retrieval and mining tasks. This can be considered an extension of the salient feature approach in image retrieval. A multi-scale morphological model is developed to define salient image objects in the image retrieval context. It is composed of two separate parts: a morphological model for multi-scale planar shapes and an intensity model with two dominant intensity levels. Based on this model, an object detection operator, called the image relevance function, is developed. Positions of salient image objects correspond to locations of salient maxima of the relevance function. Each salient object is represented by a feature vector containing both local shape features and intensity (color) features. Image retrieval is based on establishing a correspondence between two sets of salient objects (those of a query image and those of a database image). Image mining tasks in this context can be reduced to the construction of image models involving relations between salient image objects. The experiments conducted with synthetic and real images confirmed the adequacy of the proposed model and sufficient accuracy of the salient object detection.

1. Introduction

The feature extraction problem for image content representation is crucial to the effective implementation of content-based image retrieval (CBIR) tasks [11, 14]. Image data mining (IDM) relies mostly on image features of higher, semantic levels and cannot be effective when using only low-level features such as pixel values [4, 15].
A good feature set decisively influences the adequacy of retrieval (i.e., correct image interpretation), provides stability (i.e., robustness) to distortions and invariance to geometrical transformations, and, finally, can substantially reduce the retrieval time. The existing approaches to feature extraction for image retrieval can be roughly divided into two categories with respect to the image content description: computation of global features, and computation of local features together with their relationships. The first approach has obvious limitations in image retrieval, since global features such as color histograms cannot capture all image fragments having different local characteristics. The semantic gap, which manifests itself as incorrect image interpretation, still persists in many image retrieval and mining methods, especially when global features of only one type are used [11, 14]. Another major concern with feature extraction is the invariance problem: geometrical transformations of images substantially change the feature values if the features are not invariant to such transformations. Therefore, local invariant features corresponding to salient image fragments can give an adequate and sufficient description of the image content for retrieval or data mining purposes. Salient image fragments are image locations with significant intensity contrast and particular shape characteristics, such as isolated blob-like regions or corner fragments. The salient image fragments are relevant to image description and stable under intensity changes and some geometrical (view) transformations. This is a relatively new approach to image description in CBIR, which is referred to as the approach of salient image features [11, 14]. The approach of salient features for image analysis has been developed independently over many years in order to perform an effective and time-efficient search for objects of interest by attention focusing [5, 6, 16].
More recently, a similar approach has been proposed by several researchers to cope with some problems of feature extraction in CBIR [8, 13, 14]. However, some basic issues in the application of visual attention mechanisms and feature extraction still remain open. One major problem is the absence of an explicit measure of the saliency of image fragments (local features), which impedes the selection of image fragments for an adequate description of image content. This problem is also related to the well-known semantic gap in image retrieval [4, 11]. Invariance to intensity changes and to the transformations of translation, scaling, and rotation is of great concern for robust image retrieval [1, 12]. The object detection method described in this paper is an attempt to obtain a concise image description based on the salient feature approach and to eliminate, as far as possible, the existing drawbacks of feature extraction for CBIR and IDM. The approach of salient image objects was initially proposed in [8] to perform some CBIR tasks effectively.
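
The limitation of global features noted earlier in this section can be illustrated with a small sketch. It is not part of the proposed method: the two 4x4 test images, the intensity values 0 and 9, and the helper name global_histogram are all assumptions chosen for illustration. The sketch only shows that a global intensity histogram can be identical for two images whose local structure differs (one compact blob versus four scattered pixels).

```python
from collections import Counter


def global_histogram(image):
    """Count how often each intensity value occurs, ignoring pixel positions.

    Hypothetical helper for illustration; `image` is a list of rows of ints.
    """
    return Counter(value for row in image for value in row)


# Assumed toy image: one compact bright 2x2 blob on a dark background.
image_blob = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]

# Assumed toy image: the same four bright pixels scattered to the corners.
image_scattered = [
    [9, 0, 0, 9],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [9, 0, 0, 9],
]

hist_blob = global_histogram(image_blob)
hist_scattered = global_histogram(image_scattered)

# The global histograms agree although the images differ in local structure.
print("same histogram:", hist_blob == hist_scattered)
print("same image:", image_blob == image_scattered)
```

A retrieval scheme that compared only these histograms would treat the two images as equivalent, which is the kind of ambiguity that a description by local salient fragments is meant to avoid.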