LOCAL-DRIVEN SEMI-SUPERVISED LEARNING WITH MULTI-LABEL

Teng Li 1, Shuicheng Yan 2, Tao Mei 3 and In-So Kweon 1

1 Department of Electrical Engineering, Korea Advanced Institute of Science and Technology
2 Department of Electrical and Computer Engineering, National University of Singapore
3 Microsoft Research Asia, Beijing, P. R. China

ABSTRACT

In this paper, we present a local-driven semi-supervised learning framework that propagates the labels of multi-label training data to unlabeled data. Instead of treating each datum as a graph vertex, we encode each extracted local feature descriptor as a vertex. The labels for each training vertex are derived from the context among different training data, and the decomposed labels on each vertex are then propagated to the unlabeled vertices based on similarities measured from the features extracted at each local region. With the learnt local descriptor graph, we can predict semantic labels not only for the test local features but also for the test images. Experiments on multi-label image annotation demonstrate encouraging results for the proposed semi-supervised learning framework.

Index Terms: Semi-supervised Learning, Image Annotation, Local Features, Multi-Label Learning

1. INTRODUCTION

Semi-supervised learning is an important topic in image classification that has attracted significant attention recently. It leverages unlabeled data in addition to labeled data for classification, and thereby addresses the lack of sufficient labeled data in many real applications. Many semi-supervised learning algorithms have been proposed [1]; among them, graph-based methods are the main theme owing to their effectiveness and efficiency [2, 3, 4].
These methods construct the graph using both training and test samples and propagate the known labels to all the vertices based on certain assumptions formulated in a regularization framework. For example, the Gaussian Random Field (GRF) and harmonic function method defines a quadratic loss function with infinite weights to clamp the labeled examples, and formulates the regularizer based on the combinatorial graph Laplacian [2].

Conventional semi-supervised learning methods mainly target the case of a single label per datum. Recently, with the availability of multi-label image datasets, semi-supervised learning with multi-label has become an important problem with many applicable scenarios. Typical graph-based learning can be applied directly to multi-label cases without considering the dependencies between labels. Several algorithms have also been proposed to address the inherent correlations among multiple labels, for example by adding a regularizer to the semi-supervised learning framework [5, 6, 7]. They demonstrate the value of label correlations with promising results.

[Figure 1: panels "training images" and "test images"; image-level label vectors [rb], [rb], [b], [r], [rb], [rb].]
Fig. 1. An illustration of the proposed approach: vectors with “r” and “b” represent different labels of the images; red and blue circles denote the local descriptors of “r” and “b” respectively. For better view, please see the color pdf file.

All these methods model the semantic relation between images based on global feature matching: an image is considered as a vertex linked with others in the graph. In this paper, we propose a novel local-driven semi-supervised learning approach based on local feature descriptor matching. Fig. 1 illustrates the main idea: local feature descriptors in the images are extracted to construct the graph, and the edges are set up according to image matching criteria.
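As a point of reference for the graph-based baseline discussed above, the harmonic function solution of [2] can be sketched in a few lines and applied to each label column independently, which is the direct multi-label use that ignores label dependencies. This is only a minimal illustration and not the local-driven method proposed in this paper; the function name `harmonic_propagate`, the dense NumPy representation, and the four-vertex chain graph are assumptions made for the example.

```python
import numpy as np

def harmonic_propagate(W, Y_labeled, labeled_idx):
    """Harmonic-function label propagation, one column per label.

    W           : (n, n) symmetric, non-negative affinity matrix.
    Y_labeled   : (l, c) 0/1 label matrix; row i belongs to labeled_idx[i].
    labeled_idx : indices of the labeled (clamped) vertices.
    Returns the unlabeled indices and their (u, c) soft label scores.
    """
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)

    # Combinatorial graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W

    # Clamping the labeled vertices leaves a linear system on the
    # unlabeled block: L_uu F_u = W_ul Y_l.
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    F_u = np.linalg.solve(L_uu, W_ul @ Y_labeled)
    return unlabeled_idx, F_u

# Toy chain graph 0 - 1 - 2 - 3 with unit edge weights (illustrative only).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Vertex 0 carries both labels, vertex 3 carries only the second label.
Y_labeled = np.array([[1, 1],
                      [0, 1]], dtype=float)

idx, F_u = harmonic_propagate(W, Y_labeled, labeled_idx=[0, 3])
print(idx)
print(F_u)
```

On this chain, the first label decays with distance from vertex 0 (scores 2/3 and 1/3 on vertices 1 and 2), while the second label, held by both labeled endpoints, receives a score of 1 on both unlabeled vertices. Each label column is solved independently, so no correlation between labels is exploited.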
978-1-4244-4291-1/09/$25.00 ©2009 IEEE. ICME 2009.

Labels for the training vertices are derived from the context of matching images with multi-labels. As shown in Fig. 1, if a group of linked local feature descriptors share a common image-level label such as “r” or “b”, the vertices are associated with the corresponding label. The local labels are propagated to vertices in the test images by graph-based semi-supervised learn-