Semi-supervised Corrupted Face Classification
via Graph Learning
Yisheng Zhong, Ao Li
{951208709@qq.com(Yisheng Zhong), dargonbuy@126.com(Ao Li)}
School of Computer Science and Technology, Harbin University of Science and Technology,
Harbin, China
Abstract— Semi-supervised learning aims to train a model with both labeled and unlabeled data by
exploring the relationships among them. Graph-based semi-supervised learning is a classical representative
method that learns the class indicator matrix by propagating similarity within a well-designed graph
constructed from the data. However, face data often suffer from missing pixels or occlusion, which
degrades graph learning performance and leads to poor semi-supervised classification results. To address this
problem, a novel semi-supervised corrupted face classification method via graph learning is proposed, in
which a dynamic graph is learned from the completed face data recovered from a low-rank subspace. In
the proposed method, robust data representation and graph learning are performed alternately to
obtain an overall optimal solution. Experimental results demonstrate that the proposed method
outperforms the comparison methods in both classification accuracy and robustness.
Keywords: Semi-supervised face classification, Graph learning, Self-representation model, Low-rank constraint.
I. Introduction
Faced with massive high-dimensional data, how to conduct effective data analysis and processing
has become a major problem in machine learning and other fields [1]. In recent years, learning a graph
automatically from data has become a popular research topic, and the adaptive neighbor method is one of
the important approaches. It constructs a similarity matrix by assigning to each pair of data points the
probability that one point can serve as a neighbor of the other, and this probability is used as the
similarity between the two data points [2]. This does not require similarity measures that are sensitive
to noise and outliers [3], so the result obtained is of high precision.
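The neighbor-probability idea above can be illustrated with a minimal sketch. It assumes the closed-form k-nearest-neighbor solution commonly used with the adaptive neighbor method, in which each row of the similarity matrix is a probability distribution over the k closest points. The function name `adaptive_neighbor_graph` and the parameter `k` are illustrative and do not come from this paper.

```python
import numpy as np

def adaptive_neighbor_graph(X, k=5):
    """Illustrative sketch: X is (n_samples, n_features), with n_samples >= k + 2.

    Returns S, where S[i, j] is the probability that point j is a neighbor of
    point i. Each row is nonnegative, sums to 1, and has at most k nonzeros.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    S = np.zeros((n, n))
    for i in range(n):
        d = D[i].copy()
        d[i] = np.inf                      # a point is not its own neighbor
        idx = np.argsort(d)[: k + 1]       # k nearest points plus the (k+1)-th
        d_k, d_k1 = d[idx[:k]], d[idx[k]]
        denom = k * d_k1 - d_k.sum()
        if denom <= 1e-12:                 # all k+1 distances equal: uniform weights
            S[i, idx[:k]] = 1.0 / k
        else:                              # closer points receive larger probability
            S[i, idx[:k]] = (d_k1 - d_k) / denom
    return S

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
S = adaptive_neighbor_graph(X, k=5)
```

In this sketch the number of neighbors k is the only parameter, and the resulting S is generally not symmetric; a symmetric graph can be obtained as (S + S.T) / 2.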
However, when the original data contain noise or interference, the graph obtained by graph learning
may be inaccurate or suboptimal and cannot accurately describe the true relationships
between the data. To solve this problem, Zhao Kang et al. [4] proposed a robust graph learning
scheme based on the adaptive neighbor method, which decomposes the original data into a low-rank
matrix D ("clean data") and a sparse matrix E ("noise/error"). The adaptive neighbor method is then
used to build the graph on the clean data D, so that image disturbances are removed and the graph is
learned at the same time. However, when the data are occluded or partially missing, the results of this
method are unreliable, or no usable result is obtained at all. The Laplacian Score (LS) method proposed in
[5] introduced the analysis of the local structure of the data on the basis of MaxVar. But these two methods
only consider the characteristics of the data themselves, ignore the correlation between features, and
cannot guarantee an optimal feature subset. Inspired by the self-similarity of images, Zhu et al. [6] argued
that images should not only have self-similarity in structure, but should also be able to represent
themselves at the feature level. They proposed an unsupervised feature selection method
based on regularized self-representation by constructing a regularized self-representation (RSR) model.
This method constructs a self-representation model by assuming that each feature in the high-dimensional
data can be expressed as a linear combination of the other features, and removes insignificant
features by adding an $\ell_{2,1}$-norm constraint to the feature weight matrix W. We borrow the
regularized self-representation approach to handle large-scale image corruption or missing regions.
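To make the role of the $\ell_{2,1}$ norm concrete, the following minimal sketch computes it for a small weight matrix and ranks features by their row norms. It assumes that row i of W collects the weights with which feature i helps to represent the other features, so a row close to zero marks an insignificant feature. The helper names `l21_norm` and `rank_features` and the example matrix are illustrative and do not come from this paper.

```python
import numpy as np

def l21_norm(W):
    """l2,1 norm: the sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def rank_features(W):
    """Feature indices sorted by decreasing row norm (most significant first)."""
    return np.argsort(-np.linalg.norm(W, axis=1))

# Toy weight matrix: feature 1 has an all-zero row and would be discarded.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])

print(l21_norm(W))        # 6.0  (row norms are 5, 0 and 1)
print(rank_features(W))   # [0 2 1]
```

Because the penalty acts on whole rows rather than on individual entries, minimizing it tends to zero out entire rows of W, which is what allows it to be used for feature selection.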
The rest of this article is organized as follows. Section II introduces graph learning and our
multi-source robust graph learning technique. Section III gives the details of the algorithm. Section IV
evaluates the method experimentally on the clustering task. Section V discusses semi-supervised
applications, and Section VI compares the data recovery results.
EAI MOBIMEDIA 2020, August 27-28, Harbin, People's Republic of China
Copyright © 2020 EAI
DOI 10.4108/eai.27-8-2020.2296556