Evolving Systems
https://doi.org/10.1007/s12530-019-09308-2
ORIGINAL PAPER
Efficient hybrid algorithms for density based subspace clustering
to deal with density divergence for improved quality and conciseness
B. Jaya Lakshmi¹ · K. B. Madhuri¹ · M. Shashi²
Received: 7 March 2019 / Accepted: 12 October 2019
© Springer-Verlag GmbH Germany, part of Springer Nature 2019
Abstract
Subspace clustering is the process of identifying clusters of objects that are similar in subsets of attributes defining subspaces.
Subspace clustering faces three major challenges. Firstly, subspace clustering algorithms explore an exponential number of
subspaces, which may contain redundant clusters. This challenge is handled by a rough set based approach, the
interesting subspace clustering (ISC) algorithm, which improves the efficiency of the process by pre-pruning the uninteresting
subspaces and identifying dense clusters only in interesting subspaces. Secondly, an enormous number of subspace clusters is
generated, which makes their interpretation difficult. This is addressed by a summarization algorithm, Similarity connectedness
based Clustering on subspace Clusters (SCoC), which generates a compact set of high dimensional summarized subspace
clusters based on the novel concept of Similarity Connectedness. Finally, the problem of density divergence while forming
subspace clusters of different dimensionality is handled by the subspace clustering with density variation algorithm, which
produces high quality clusters using appropriate density thresholds based on the spread of the data in the given subspace.
The solutions proposed by the authors for the above challenges are orthogonal to one another, and hence in this paper the authors
propose to hybridize them. The first hybridization approach, Improved-ISC, achieves better quality subspace clusters and
efficiency in the exploration of subspaces. The second hybridization approach, the Cascaded-SCoC algorithm, achieves a compact
set of improved quality subspace clusters. Both algorithms outperform the existing algorithms in terms of the quality and
conciseness of the resulting clusters.
Keywords Density based subspace clustering · Density divergence · Summarization · Similarity–connectedness ·
Interesting subspace · Attribute dependencies
1 Introduction
Most real-world applications accumulate voluminous
data that provide ample opportunity for data analysts to
extract knowledge in support of decision making. Cluster
analysis supports data analysis for activities such as the
identification of different market segments, document
clustering at various levels of granularity, etc. Clusters in
full dimensional space might not be interesting for all
purposes, since different features contribute differently to
forming clusters of objects for varied purposes. In other
words, for datasets with a large number of features, the
clusterability of objects changes with the different
combinations of features defining subspaces. Researchers in
subspace clustering aim at finding meaningful clusters in
subspaces formed by different combinations of attributes,
with objects similar from different perspectives.
A subspace cluster is denoted by a pair 〈C, A〉, where C
is the set of objects grouped into the cluster and A is
the set of attributes defining the subspace.
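The 〈C, A〉 notation can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names `SubspaceCluster`, `project` and `subspace_distance`, the use of Euclidean distance, and the toy data are hypothetical and are not taken from the algorithms discussed in this paper. The sketch shows how two objects can be close in the subspace A while being far apart in the full dimensional space.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class SubspaceCluster:
    """Hypothetical container for a subspace cluster <C, A>."""
    objects: frozenset      # C: ids of the objects grouped into the cluster
    attributes: frozenset   # A: indices of the attributes defining the subspace


def project(point, attributes):
    """Keep only the coordinates of `point` that belong to the subspace."""
    return tuple(point[a] for a in sorted(attributes))


def subspace_distance(p, q, attributes):
    """Euclidean distance between p and q measured only in the subspace."""
    return dist(project(p, attributes), project(q, attributes))


# Assumed toy data: three objects described by three attributes.
data = {
    0: (1.0, 1.0, 9.0),
    1: (1.0, 2.0, 0.0),
    2: (8.0, 8.0, 0.0),
}

# Objects 0 and 1 grouped in the subspace spanned by attributes 0 and 1.
cluster = SubspaceCluster(objects=frozenset({0, 1}),
                          attributes=frozenset({0, 1}))

near_in_subspace = subspace_distance(data[0], data[1], cluster.attributes)
far_in_full_space = dist(data[0], data[1])
```

In this toy setting, objects 0 and 1 differ strongly only in attribute 2, so they appear similar once that attribute is excluded from the subspace.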
Subspace clustering techniques are broadly classified into
grid-based and density-based. Grid-based techniques
partition the subspace into equal-sized grid cells, where the
size of the grid cell is user defined. The quality of the
clusters produced by grid-based methods is affected by the
size of the grid cell and the position of the grid cell in the
feature space. Based on the notion of density, the
density-based techniques discover arbitrarily shaped dense
clusters separated by low-density regions. Density-based
methods can also detect outliers and filter them.
The given cluster is extended to the
* B. Jaya Lakshmi
meet_jaya200@gvpce.ac.in
¹ Department of IT, Gayatri Vidya Parishad College
of Engineering, Andhra Pradesh 530048, India
² Department of CS and SE, AU College of Engineering (A),
Andhra Pradesh 530003, India