Soft Comput
DOI 10.1007/s00500-017-2801-6
METHODOLOGIES AND APPLICATION
Collaborative multi-view K-means clustering
Safa Bettoumi¹ · Chiraz Jlassi¹ · Najet Arous¹
© Springer-Verlag GmbH Germany 2017
Abstract Because data coming from websites and new technologies are
highly diverse and heterogeneous, data contents are often better
described by multiple representations (views), whose complementary
characteristics can then be exploited. This paper presents and
discusses a new approach to collaborative multi-view clustering that
is based on the K-means principle but modifies it in several ways. The
approach seeks a consensus solution across the multiple
representations, exploiting information from each of them to improve
on the performance of a classical clustering system. To demonstrate
its effectiveness, the proposed approach is evaluated on two image
datasets of different sizes and with different features. The results
confirm that multi-view clustering performs well and show that our
proposal outperforms mono-view clustering, as well as several other
algorithms from the literature, in terms of accuracy, purity and
normalized mutual information.
Keywords Multi-view clustering · K-means clustering ·
Collaborative clustering
1 Introduction
With the growing amount of available information, data are
increasingly disorganized and collected from heterogeneous sources.
This raises new challenges for the clustering problem. It has become
possible, and common, to have multiple views of the same set of
individuals. Having several views naturally makes it possible to
produce a clustering specific to each of them, which in turn yields
several useful combinations of clusterings and supports a broader
interpretation. Furthermore, clustering can become more accurate when
the rich information carried by the different views is analyzed. The
results of all these clusterings should therefore be taken into
account to enrich the cluster-building process.
Communicated by V. Loia.
✉ Safa Bettoumi
safa.bettoumi@gmail.com
¹ LR-SITI-ENIT (Signal, Images et Technologies de l’Information),
Ecole Nationale d’Ingénieurs de Tunis, BP-37, Campus Universitaire,
1002 Tunis, Tunisia
Multi-view clustering is found in various disciplines, including
economic, social and scientific domains. Recent results show that
multi-view clustering outperforms mono-view clustering, because
having several sources of information offers an additional way to
identify good clusters. Accordingly, a large number of clustering
results are analyzed to reach the desired clustering.
In the research community, the multi-view clustering problem is
closely tied to the constraint of consensus between views and to the
question of how to integrate the views so that the clustering
processes lead to an overall solution. In this context, it is
appropriate to combine the clustering results obtained for the same
individuals into a single clustering by means of a fusion process,
which increases confidence in the obtained clusters and reduces the
conflict between views.
Recently, several approaches for fusing views in the clustering
process have been proposed. Fusion can be handled differently
depending on the objective of the application. In general, fusion
approaches can be categorized as a priori fusion methods, a posteriori
fusion methods and centralized fusion methods, according to where the
fusion takes place relative to the clustering process.
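
To make the distinction concrete, the following minimal Python sketch contrasts two of these categories on toy data. It is an illustration under stated assumptions, not the method proposed in this paper. It assumes that a priori fusion means merging the views before a single clustering (here, by concatenating features) and that a posteriori fusion means clustering each view separately and then merging the results (here, through a co-association matrix). The function names, the toy data and the deterministic farthest-first initialization are all hypothetical choices made for the example. Centralized fusion, which would use the views jointly inside the clustering itself, is not sketched.

```python
# Illustrative sketch only: function names, toy data and the
# initialization rule are assumptions, not taken from the paper.

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20):
    """Plain Lloyd-style K-means; returns one cluster label per point."""
    # Deterministic farthest-first initialization (assumed here so that
    # the example is reproducible without a random seed).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def a_priori_fusion(views, k):
    """Fuse first: concatenate the features of all views, then cluster once."""
    n = len(views[0])
    fused = [sum((tuple(v[i]) for v in views), ()) for i in range(n)]
    return kmeans(fused, k)

def a_posteriori_fusion(views, k):
    """Cluster each view first, then fuse the per-view partitions."""
    per_view = [kmeans(v, k) for v in views]
    n = len(views[0])
    # co[i][j] = fraction of views in which individuals i and j share a cluster.
    co = [tuple(sum(lab[i] == lab[j] for lab in per_view) / len(per_view)
                for j in range(n)) for i in range(n)]
    # Consensus: cluster the rows of the co-association matrix.
    return kmeans(co, k)

# Two toy views of the same six individuals (hypothetical data).
view1 = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
view2 = [(0.0,), (0.2,), (0.1,), (5.0,), (5.2,), (5.1,)]

print(a_priori_fusion([view1, view2], 2))
print(a_posteriori_fusion([view1, view2], 2))
```

In this sketch, concatenation lets features with larger numeric ranges dominate the distance, so the views would normally be rescaled first. The co-association step depends only on which individuals are grouped together, so cluster labels do not need to be matched across views.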