Evolving Systems
https://doi.org/10.1007/s12530-019-09308-2

ORIGINAL PAPER

Efficient hybrid algorithms for density based subspace clustering to deal with density divergence for improved quality and conciseness

B. Jaya Lakshmi 1 · K. B. Madhuri 1 · M. Shashi 2

Received: 7 March 2019 / Accepted: 12 October 2019
© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Abstract
Subspace clustering is the process of identifying clusters of objects that are similar in subsets of attributes defining subspaces. Subspace clustering faces three major challenges. Firstly, subspace clustering algorithms explore an exponential number of subspaces, which may contain redundant clusters. This challenge is handled by a rough set based approach called the Interesting Subspace Clustering (ISC) algorithm, which improves the efficiency of the process by pre-pruning the uninteresting subspaces and identifying dense clusters only in interesting subspaces. Secondly, an enormous number of subspace clusters is generated, which makes their interpretation difficult. This is addressed by a summarization algorithm, Similarity connectedness based Clustering on subspace Clusters (SCoC), which generates a compact set of high dimensional summarized subspace clusters based on the novel concept of Similarity Connectedness. Finally, the problem of density divergence while forming subspace clusters of different dimensionality is dealt with by the subspace clustering with density variation algorithm, which produces high quality clusters using appropriate density thresholds based on the spread of the data in the given subspace. The solutions proposed by the authors for the above challenges are orthogonal to one another and hence, in this paper, the authors propose to hybridize them. The first hybridization approach, Improved-ISC, achieves better quality subspace clusters and efficiency in the exploration of subspaces.
The second hybridization approach, the Cascaded-SCoC algorithm, achieves a compact set of improved quality subspace clusters. Both algorithms outperform the existing algorithms in terms of the quality and conciseness of the resulting clusters.

Keywords Density based subspace clustering · Density divergence · Summarization · Similarity–connectedness · Interesting subspace · Attribute dependencies

1 Introduction

Most real-world applications accumulate voluminous data that provide ample opportunity for data analysts to extract knowledge in support of decision making. Cluster analysis supports data analysis for activities such as identification of different market segments and document clustering at various levels of granularity. Clusters in the full dimensional space might not be interesting for all purposes, since different features contribute differently to forming clusters of objects for varied purposes. In other words, for datasets with a large number of features, the clusterability of objects changes with the different combinations of features that define subspaces. Researchers in subspace clustering aim at finding meaningful clusters in subspaces formed by different combinations of attributes, with objects that are similar from different perspectives. A subspace cluster is denoted by a pair (C, A), where C is the set of objects grouped into the cluster and A is the set of attributes defining the subspace.

Subspace clustering techniques are broadly classified into grid-based and density-based techniques. Grid-based techniques partition the subspace into equal sized grid cells, where the size of the grid cell is user defined. The quality of the clusters produced by grid-based methods is affected by the size of the grid cell and the position of the grid cell in the feature space. Density-based techniques rely on a notion of density to discover arbitrarily shaped dense clusters separated by low density regions. Density-based methods can also detect outliers and filter them.
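The dependence of clusterability on the chosen attributes can be illustrated with a minimal sketch. It assumes a generic DBSCAN-style density notion (an object is dense if at least `min_pts` objects, itself included, lie within distance `eps`), which is not necessarily the exact definition used by the algorithms in this paper. The function names, the toy data and the values of `eps` and `min_pts` are hypothetical and chosen only for illustration.

```python
from math import dist

def project(point, attrs):
    """Restrict a point to the attributes (column indices) of a subspace."""
    return tuple(point[a] for a in attrs)

def core_objects(data, attrs, eps, min_pts):
    """Return the indices of objects that have at least min_pts neighbours
    (the object itself included) within distance eps in the subspace attrs."""
    projected = [project(p, attrs) for p in data]
    cores = set()
    for i, p in enumerate(projected):
        neighbours = sum(1 for q in projected if dist(p, q) <= eps)
        if neighbours >= min_pts:
            cores.add(i)
    return cores

# Toy data: four objects described by three attributes (hypothetical values).
data = [
    (1.0, 1.1, 0.0),
    (1.1, 1.0, 5.0),
    (0.9, 1.0, 10.0),
    (5.0, 9.0, 15.0),
]

# A subspace cluster written as a pair (C, A): object set and attribute set.
# Objects 0-2 lie close together on attributes {0, 1} ...
cluster_01 = (core_objects(data, attrs=(0, 1), eps=0.5, min_pts=3), (0, 1))
# ... but are far apart once attribute 2 is included.
cluster_full = (core_objects(data, attrs=(0, 1, 2), eps=0.5, min_pts=3), (0, 1, 2))

print(cluster_01)
print(cluster_full)
```

In this toy setting the same objects are dense in the two-attribute subspace and sparse in the full space, which is the situation that motivates searching subspaces rather than clustering on all attributes at once.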
* B. Jaya Lakshmi
meet_jaya200@gvpce.ac.in
1 Department of IT, Gayatri Vidya Parishad College of Engineering, Andhra Pradesh 530048, India
2 Department of CS and SE, AU College of Engineering (A), Andhra Pradesh 530003, India

The given cluster is extended to the