Using Pattern Search Methods for Minimizing Clustering Problems

Parvaneh Shabanzadeh, Malik Hj Abu Hassan, Leong Wah June, Maryam Mohagheghtabar

Abstract—Clustering is one of the most interesting data mining topics and can be applied in many fields. Recently, the problem of cluster analysis has been formulated as a nonsmooth, nonconvex optimization problem, and an algorithm for solving the cluster analysis problem based on nonsmooth optimization techniques has been developed. This optimization problem has a number of characteristics that make it challenging: it has many local minima, the optimization variables can be either continuous or categorical, and no exact analytical derivatives are available. In this study we show how to apply a particular class of optimization methods, known as pattern search methods, to address these challenges. These methods do not explicitly use derivatives, an important feature that has not been addressed in previous studies. Results of numerical experiments are presented which demonstrate the effectiveness of the proposed method.

Keywords—Clustering functions, Non-smooth Optimization, Non-convex Optimization, Pattern Search Method.

I. INTRODUCTION

CLUSTERING is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a combinatorially difficult problem, and different approaches to it have been proposed and studied [1]. In cluster analysis we are given a finite set B of points in the d-dimensional space R^d, that is, B = {b_1, ..., b_n}, where b_j ∈ R^d, j = 1, ..., n. There are different types of clustering, such as packing, partition, covering, and hierarchical clustering [2].
In this paper we consider partition clustering, that is, the problem of distributing the points of the set B into a given number k of disjoint, nonempty subsets B_i with respect to predefined criteria such that:

(i) B_i ≠ ∅, i = 1, ..., k;
(ii) B_i ∩ B_j = ∅, i, j = 1, ..., k, i ≠ j;
(iii) B = ∪_{i=1}^{k} B_i.

The sets B_i, i = 1, ..., k are called clusters. Suppose that each cluster B_i, i = 1, ..., k can be recognized by its center (or centroid) bc_i ∈ R^d, i = 1, ..., k. Then the clustering problem can be reduced to the following optimization problem [3], [4]:

min ψ(bc, w) = (1/n) Σ_{j=1}^{n} Σ_{i=1}^{k} w_{ij} ‖bc_i − b_j‖²    (1)

s.t. bc = (bc_1, ..., bc_k) ∈ R^{d×k},
     Σ_{i=1}^{k} w_{ij} = 1, j = 1, ..., n,
     w_{ij} = 0 or 1, for j = 1, ..., n, i = 1, ..., k,

where w_{ij} is the association weight of pattern b_j with cluster i (w_{ij} = 1 if pattern j is allocated to cluster i and w_{ij} = 0 otherwise), and

bc_i = (Σ_{j=1}^{n} w_{ij} b_j) / (Σ_{j=1}^{n} w_{ij}),  i = 1, ..., k.

Here ‖·‖ is the Euclidean norm and w is an n × k matrix. Problem (1) is also known as the minimum sum-of-squares clustering problem. It is a global optimization problem, so different mathematical programming algorithms can be applied to solve it. Up-to-date reviews of these algorithms, including dynamic programming, branch and bound, and cutting planes, are presented in [1], [5]. Different heuristics can also be used for solving large clustering problems, and k-means is one such algorithm. It is a fast and well-known clustering algorithm that gives good results when there are few clusters, but its performance worsens when there are many [2], [6].

[M. Hassan, L. June, and P. Shabanzadeh are with the Institute for Mathematical Research (INSPEM), University Putra Malaysia (e-mail: malik@science.upm.edu.my, leongwj@putra.upm.edu.my, parvaneh.sh@inspem.upm.edu.my). M. Mohagheghtabar is with the Department of Mathematics, University Kebangsaan Malaysia (e-mail: maryamohaghegh@gmail.com).]
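To make the objective in (1) concrete, the following is a minimal sketch, assuming NumPy is available; the function names `mssc_objective` and `centroids` are ours for illustration, not from the paper. It evaluates ψ(bc, w) for a given hard assignment (the w matrix encoded as a label vector) and computes the centroid formula above.

```python
import numpy as np

def mssc_objective(centers, labels, B):
    """Evaluate psi(bc, w) of problem (1).

    centers : (k, d) array of cluster centers bc_i
    labels  : (n,) int array; labels[j] = i encodes w_ij = 1
    B       : (n, d) array of data points b_j
    """
    n = B.shape[0]
    # squared Euclidean distance from each point to its assigned center
    diffs = B - centers[labels]
    return np.sum(diffs ** 2) / n

def centroids(labels, B, k):
    """Centroid update bc_i = (sum_j w_ij b_j) / (sum_j w_ij)."""
    return np.array([B[labels == i].mean(axis=0) for i in range(k)])

# example: four points, two clusters
B = np.array([[0., 0.], [0., 2.], [10., 0.], [10., 2.]])
labels = np.array([0, 0, 1, 1])
c = centroids(labels, B, 2)          # centers at [0, 1] and [10, 1]
obj = mssc_objective(c, labels, B)   # each point is 1 unit from its center
```

Note that `centroids` assumes every cluster is nonempty, matching condition (i) above.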
Much better results have been obtained with metaheuristics such as genetic algorithms, tabu search, and simulated annealing [7]. Continuing the effort to find a local optimizer for the clustering problem, an algorithm for clustering based on non-smooth optimization techniques was developed in [5]. Here we introduce this algorithm, which calculates clusters step by step, gradually increasing the number of data clusters until stopping conditions are met. In this approach the clustering problem is reduced to an unconstrained optimization problem with a non-smooth objective function. In the present article we adapt pattern search methods to solve this optimization problem, and we verify that they perform better than some of the best-known methods. Pattern search methods have been widely employed in many applications. They require an initialization phase that uses heuristics to find good starting points near promising local minima before the pattern search algorithm is run, where it acts as a local optimizer for the continuous variables. They generate significantly fewer invalid solutions, and our numerical experiments show that pattern search methods are robust. An excellent introduction to and survey of these methods, containing numerous references, can be found in [8]. We review these methods in more detail in Section IV.

(World Academy of Science, Engineering and Technology, International Journal of Mathematical and Computational Sciences, Vol. 4, No. 2, 2010)
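As a preview of the class of methods surveyed in [8], the following is an illustrative sketch of the basic compass-search variant of pattern search; this is our own minimal implementation for exposition, not the paper's algorithm. It polls the 2d coordinate directions ±e_i without using derivatives, keeps the step size on a successful poll, and halves it on failure until it falls below a tolerance.

```python
import numpy as np

def compass_search(f, x0, step=1.0, tol=1e-6, max_iter=10000):
    """Minimal derivative-free compass (pattern) search.

    Polls the 2*d coordinate directions +/- e_i; on an improving
    poll the step is kept, otherwise the mesh is contracted by
    halving the step, until the step falls below tol.
    """
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    d = x.size
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(d):
            for sign in (1.0, -1.0):
                trial = x.copy()
                trial[i] += sign * step
                ft = f(trial)
                if ft < fx:          # accept first improving poll point
                    x, fx, improved = trial, ft, True
                    break
            if improved:
                break
        if not improved:
            step *= 0.5              # contract the mesh
    return x, fx

# example: minimize f(x) = ||x - (3, -2)||^2 using only function values
x_min, f_min = compass_search(
    lambda x: np.sum((x - np.array([3., -2.])) ** 2), x0=[0., 0.])
```

In the clustering setting of this paper, f would be the non-smooth objective obtained from problem (1), for which such direct-search polling is attractive precisely because no analytical derivatives are required.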