International Journal of Pure and Applied Mathematics
Volume 68 No. 2 2011, 233-252

A FRACTAL DIMENSION STANDPOINT TO THE CLUSTER VALIDATION PROBLEM

Dvora Toledano-Kitai 1 §, Renata Avros 2, Zeev Volkovich 3

1,2,3 Software Engineering Department
ORT Braude College
P.O. Box 78, Karmiel, 21982, ISRAEL
1 e-mail: dvora@braude.ac.il
2 e-mail: r_avros@braude.ac.il
3 e-mail: vlvolkov@braude.ac.il

Received: January 20, 2011
© 2011 Academic Publications
§ Correspondence author

Abstract: The article proposes a new standpoint on the cluster validation problem, based on a fractal dimension model of cluster quality. The suggested method uses the fractal property to describe the geometrical configuration of a cluster. This notion is then applied to assess cluster validity, under the assumption that low variability of the fractal dimension, calculated over different samples, indicates a stable partition. In the framework of this model, the goodness of a partition is characterized by the quality of mixing of two random samples within the partition's clusters. It is implicitly assumed that the quality of a cluster is reflected by its fractal dimension as estimated from different samples. Valid results are obtained by repeating these calculations on a sufficiently large number of drawn samples. From these repetitions, empirical distributions of the absolute values of the fractal dimension differences are constructed. The distribution most concentrated at the origin is proposed to indicate the true number of clusters. Numerical experiments are presented for various datasets.

AMS Subject Classification: 28A80, 13F60

Key Words: cluster validation problem, fractal dimension cluster quality model, empirical distributions, fractal dimension, numerical experiments

1. Introduction

Cluster analysis is an important tool in machine learning, typically employed in