Novel Initialization Strategy for K-modes Clustering Algorithm
Aizan Zafar and K. Swarupa Rani
Abstract K-modes is a computationally efficient method for clustering categorical
data. The quality of the clustering depends on an appropriate selection of initial
modes, so the initialization step plays a crucial role in the performance of the
K-modes clustering algorithm. Several initialization methods exist for numerical
data, but they cannot be applied to categorical data, which lack geometric
structure. In this paper, we propose a density parameter-based method for selecting
the initial cluster modes and integrate it with the traditional K-modes clustering
algorithm, leading to more accurate clustering results. The proposed method aims to
select initial cluster modes that lie close to the final cluster modes. Experiments
were carried out on several benchmark datasets, and the results were compared with
those of traditional methods.
Keywords K-modes clustering · Average similarity · Density parameter
1 Introduction
We live in a world full of data. Storing and representing huge amounts of data for
analysis and management is a challenging task. To handle such data effectively, one
can either classify or cluster them. Clustering is an unsupervised learning method
that partitions data objects into groups. The idea behind clustering is to form
groups based on similarity and dissimilarity measures between data objects [1].
Clustering is important in numerous fields, for example, image processing, pattern
recognition, data mining, and information science.
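To make the idea of a dissimilarity measure on categorical objects concrete, the following minimal sketch shows the simple matching dissimilarity and the attribute-wise mode that the standard K-modes algorithm relies on. It does not implement the density parameter-based initialization proposed in this paper; the function names and the toy records are hypothetical and chosen only for illustration.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Count the attributes on which two categorical records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(records):
    """Return the attribute-wise most frequent value of a group of records."""
    if not records:
        raise ValueError("cannot compute the mode of an empty group")
    # zip(*records) iterates over columns (attributes) instead of rows.
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


# Hypothetical toy data: three objects described by three categorical attributes.
records = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "square"),
]

mode = cluster_mode(records)
distances = [matching_dissimilarity(r, mode) for r in records]
print(mode, distances)
```

Because categorical values have no ordering or arithmetic mean, the mode plays the role that the centroid plays for numerical data, and the mismatch count replaces a geometric distance.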
A. Zafar · K. Swarupa Rani (✉)
School of Computer and Information Sciences, University of Hyderabad, Hyderabad 500046,
India
e-mail: kswarupaprasad@gmail.com
A. Zafar
e-mail: aizanzafar@gmail.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
R. Patgiri et al. (eds.), Proceedings of International Conference on Big Data, Machine
Learning and Applications, Lecture Notes in Networks and Systems 180,
https://doi.org/10.1007/978-981-33-4788-5_8