IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 5, September 2010
ISSN (Online): 1694-0814
www.IJCSI.org

An Evolvable-Clustering-Based Algorithm to Learn Distance Function for Supervised Environment

Zeinab Khorshidpour, Sattar Hashemi, Ali Hamzeh
Dept. of Computer Science, Shiraz University, Iran

Abstract

This paper introduces a novel weight-based approach to learning a distance function: finding the weights that induce a clustering which best meets the objective function. Our method combines clustering and evolutionary algorithms to learn the weights of the distance function. Evolutionary algorithms have proved to be good techniques for finding optimal solutions in large solution spaces and to be stable in the presence of noise. Experiments with UCI datasets show that employing an EA to learn the distance function improves the accuracy of the popular nearest-neighbor classifier.

Keywords: distance function learning; evolutionary algorithm; clustering algorithm; nearest neighbor.

1 Introduction

Almost all learning tasks, such as case-based reasoning [1], cluster analysis, and nearest-neighbor classification, depend mainly on assessing the similarity between objects. Unfortunately, defining object similarity measures is a difficult and non-trivial task: such measures are often sensitive to irrelevant, redundant, or noisy features. Many proposed methods attempt to reduce this sensitivity by parameterizing K-NN's similarity function using feature weighting. The idea behind feature weighting is that real-world applications involve many features, whereas the objective function depends on only a few of them. The presence of noisy objects or irrelevant features in a dataset degrades the performance of machine learning algorithms, as in the case of the k-nearest-neighbor algorithm (K-NN). A feature weighting technique may therefore improve the algorithm's performance.
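To make the idea of feature weighting concrete, the following sketch (not the algorithm proposed in this paper) parameterizes a nearest-neighbor classifier with a weighted Euclidean distance; the toy data, the weight vectors, and the function names are illustrative assumptions only. Driving the weight of a noisy feature toward zero removes its influence on the neighbor search.

```python
import numpy as np

def weighted_distance(x, y, w):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (x_i - y_i)^2)."""
    return np.sqrt(np.sum(w * (x - y) ** 2))

def knn_predict(query, X, labels, w, k=1):
    """Classify `query` by the majority label of its k nearest
    training objects under the weighted distance."""
    dists = [weighted_distance(query, x, w) for x in X]
    nearest = np.argsort(dists)[:k]
    votes = [labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Hypothetical toy data: class is determined by feature 1
# (class "a" near 1.0, class "b" near 5.0); feature 2 is noise.
X = np.array([[1.0, 0.0], [1.2, 1.0], [5.0, 8.8], [5.5, 9.2]])
labels = ["a", "a", "b", "b"]
query = np.array([1.0, 9.0])       # class "a" by feature 1, noisy feature 2

w_uniform = np.array([1.0, 1.0])   # unweighted distance
w_learned = np.array([1.0, 0.0])   # hypothetical weights suppressing the noise feature

print(knn_predict(query, X, labels, w_uniform))  # noise dominates -> "b" (wrong)
print(knn_predict(query, X, labels, w_learned))  # noise ignored  -> "a" (correct)
```

Under uniform weights the irrelevant second feature dominates the distance and the query is misclassified; zeroing that weight restores the correct neighbor, which is precisely the sensitivity that learned feature weighting aims to remove.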
This paper introduces a novel weight-based distance function learning approach to find the weights that induce a clustering which best meets the objective function. In recent years, different approaches have been proposed for learning distance functions from training objects. Stein and Niggemann use a neural network to learn the weights of distance functions based on training objects [2]. Eick et al. introduce an approach to learn distance functions that maximizes the clustering of objects belonging to the same class [3]. Objects belonging to a dataset are clustered with respect to a given distance function, and the local class density information of each cluster is then used by a weight-adjustment heuristic to modify the distance function. Another approach, introduced by Kira and Rendell and by Salzberg, relies on an interactive system architecture in which users are asked to rate a given similarity prediction; Reinforcement Learning (RL) based techniques then enhance the distance function based on the user feedback [4], [5]. Kononenko proposes an extension to the work by Kira and Rendell for updating attribute weights based on intracluster weights [6]. Bagherjeiran et al. propose a reinforcement learning algorithm that can incorporate feedback and past experience to guide the search toward better clusters [7]. They use an adaptive clustering environment that modifies the weights of a distance function based on feedback. The adaptive clustering