4OR
https://doi.org/10.1007/s10288-020-00445-y
RESEARCH PAPER
A competitive optimization approach for data clustering
and orthogonal non-negative matrix factorization
Ja’far Dehghanpour-Sahron¹ · Nezam Mahdavi-Amiri¹
Received: 3 December 2019 / Revised: 31 March 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract
Partitioning a given data-set into subsets based on similarity among the data is called
clustering. Clustering is a major task in data mining and machine learning, with
many applications such as text retrieval, pattern recognition, and web mining. Here, we
briefly review some clustering-related problems (k-means, normalized k-cut, orthogonal
non-negative matrix factorization (ONMF), and isoperimetry) and describe their
connections. We formulate the relaxed mean version of the isoperimetry problem as an
optimization problem with non-negativity and orthogonality constraints. We first use a
gradient-based optimization algorithm to solve this problem, and then apply
a post-processing technique to extract a solution of the clustering problem. We also
propose a simplified approach to improve the solution of the 2-dimensional clustering
problem, using the N-nearest neighbor graph. Inspired by this technique, we apply
a multilevel method for clustering a given data-set to reduce the size of the problem by
grouping a number of similar vertices. The number is determined based on two values,
namely, the maximum and the average of the edge weights of the vertices connected
to a selected vertex. In addition, using the connections between ONMF and k-means
and between k-means and the isoperimetry problem, we propose an algorithm to solve
the ONMF problem. A comparative performance analysis against other
related methods shows that our approach outperforms them in terms of misclassification
error rate and Rand index, on benchmark and randomly generated
problems as well as hard synthetic data-sets.
Keywords Clustering · Multilevel method · Normalized k-cut · Optimization
problem · Orthogonal non-negative matrix factorization
Mathematics Subject Classification 65K10 · 90C27
✉ Nezam Mahdavi-Amiri
nezamm@sharif.edu
Ja’far Dehghanpour-Sahron
jaafar.dehghanpour@yahoo.com
¹ Faculty of Mathematical Sciences, Sharif University of Technology, P. O. Box 11155-9415,
Tehran, Iran