An evolutionary voting for k-nearest neighbours

Daniel Mateos-García, Jorge García-Gutiérrez, José C. Riquelme-Santos

Keywords: Evolutionary computation, Nearest-neighbour, Weighted voting

Abstract

This work presents an evolutionary approach, called EvoNN, that modifies the voting system of the k-nearest neighbours (kNN) rule. Our approach produces a real-valued vector that provides the optimal relative contribution of each of the k nearest neighbours. We compare two versions of our algorithm. The first (EvoNN1) imposes a constraint on the resulting real-valued vector, assigning the greatest weight to the nearest neighbour. The second (EvoNN2) places no constraint on the order of the weights. We compare both versions with the classical kNN rule and four other weighted kNN variants on 48 datasets from the UCI repository. Results show that EvoNN1 outperforms EvoNN2 and obtains statistically better results than the rest of the compared methods.

1. Introduction

Weighting in machine learning is a common technique for emphasizing some characteristics of the data to improve the resulting models. For example, weighting has been used to highlight the importance of particular instances (Blachnik & Duch, 2011) or features (Zhi, Fan, & Zhao, 2014), or to rank a set of techniques in the context of ensembles (Berikov, 2014). In a broad sense, artificial neural networks (ANNs) and support vector machines (SVMs) can also be seen as examples of using weights in learning models, but the k-nearest neighbours (kNN) rule has been the most common technique to benefit from weights (Mateos-García, García-Gutiérrez, & Riquelme-Santos, 2012). kNN and its variants have been widely used in the literature to solve real problems. Rodger (2014) used a hybrid model to predict the demand for natural gas.
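To make the idea behind a weighted voting scheme for kNN concrete, the following is a minimal sketch (not the authors' implementation): the k votes are weighted by a given real-valued vector w, where w[0] applies to the nearest neighbour. In EvoNN the vector would be evolved; here it is simply supplied, and all names are illustrative.

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x, k, w):
    """Classify x with a kNN rule whose k votes are weighted by the
    real-valued vector w (w[0] weights the nearest neighbour).
    Illustrative sketch only; EvoNN optimises w evolutionarily."""
    d = np.linalg.norm(X_train - x, axis=1)   # Euclidean distance to each training point
    nn = np.argsort(d)[:k]                    # indices of the k nearest neighbours
    scores = {}
    for rank, idx in enumerate(nn):           # accumulate each label's weighted votes
        label = y_train[idx]
        scores[label] = scores.get(label, 0.0) + w[rank]
    return max(scores, key=scores.get)        # label with the largest weighted score

X = np.array([[0.0], [0.1], [1.0], [1.1], [1.2]])
y = np.array([0, 0, 1, 1, 1])
# An EvoNN1-style vector: weights non-increasing with neighbour rank.
w = [0.5, 0.3, 0.2]
print(weighted_knn_predict(X, y, np.array([0.2]), k=3, w=w))  # prints 0
```

With all weights equal, this reduces to the classical majority vote of kNN; the point of the paper is that an evolved, unequal w can separate classes that plain voting cannot.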
The system was implemented by integrating regression, fuzzy logic, nearest neighbours and neural networks, and considered several variables such as price, operating expenses, the cost of drilling new wells, etc. Focusing on biological data, Park and Kim (2015) selected significant genes from microarrays by using a nearest-neighbour-based ensemble of classifiers. On the other hand, Park, Park, Jung, and Lee (2015) tackled the problem of designing recommender systems. For this purpose, the authors presented Reversed CF (RCF), a fast item-based collaborative filtering algorithm that utilizes a k-nearest-neighbour graph.

The main goal of a weighting system lies in the optimization (commonly by metaheuristics) of a set of weights in the training step to obtain the highest accuracy while trying not to overfit the resulting model. Focusing on kNN weighting methods, many proposals that weight features or instances can be found. In Raymer, Punch, Goodman, Kuhn, and Jain (2000), a weighting method to obtain an optimal set of features was provided. The features were selected by means of a kNN-based genetic algorithm using a bit vector to indicate whether or not each feature was in the selection. In a later work, the same authors presented a hybrid evolutionary algorithm using a Bayesian discriminant function (Raymer, Doom, Kuhn, & Punch, 2003), aiming to isolate characteristics belonging to large datasets of biomedical origin. Moreover, Paredes and Vidal (2006) used different similarity functions to improve the behaviour of kNN. In a first approximation, they considered a weight per feature and instance on the training data, resulting in an unmanageable number of parameters in the learning process. The authors therefore proposed three types of reduction: a weight per class and feature (label dependency), a weight per prototype (prototype dependency), and a combination of the two. The optimization process was carried out by gradient descent.
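The class-and-feature reduction described above can be sketched as a weighted Euclidean distance in which the weight vector depends on the class of the training prototype. This is only an illustration of the distance form, with hypothetical names; Paredes and Vidal learn the weights by gradient descent, which is not reproduced here.

```python
import numpy as np

def class_feature_distance(x, p, label, W):
    """Distance from query x to training prototype p of class `label`,
    in the spirit of the class-and-feature (label-dependency) reduction:
    W[label] holds one weight per feature for that class.
    Illustrative sketch; the weights would normally be learned."""
    return np.sqrt(np.sum((W[label] * (x - p)) ** 2))

# With all weights equal to 1, this reduces to the plain Euclidean distance.
W = {0: np.array([1.0, 1.0]), 1: np.array([2.0, 0.5])}
print(class_feature_distance(np.array([0.0, 0.0]), np.array([3.0, 4.0]), 0, W))  # prints 5.0
```

The appeal of the reduction is plain from the signature: instead of one weight per feature per training instance, only one weight vector per class is stored, which keeps the number of learned parameters viable.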
Along the same line, Tahir, Bouridane, and Kurugollu (2007) showed an approach able to both select and weight features simultaneously by using tabu search. Furthermore, Mohemmed and Zhang (2008) presented a nearest-centroid-based classifier. This method calculated prototypical instances by taking arithmetic averages of the training data. To classify an instance, the method calculated the distance to every prototype and then selected the nearest one. The optimization of the best centroids, i.e., those minimizing the classification error, was carried out through particle swarm optimization. Fernandez and Isasi (2008) also proposed a weighting system using a prototype-based classifier. After a data normalization that was based on the position of the