Evolutionary feature weighting to improve the performance of multi-label lazy algorithms

Oscar Reyes a, Carlos Morell b and Sebastián Ventura c,d,*

a Computer Science Department, University of Holguín, Holguín, Cuba
b Computer Science Department, Universidad Central de Las Villas, Santa Clara, Cuba
c Department of Computer Science and Numerical Analysis, University of Córdoba, Córdoba, Spain
d Information Systems Department, King Abdulaziz University, Jeddah, Saudi Arabia

Abstract. In the last decade, several modern applications in which examples belong to more than one label at a time have attracted the attention of machine learning research. Several derivatives of the k-nearest neighbours classifier have been proposed to deal with multi-label data. A k-nearest neighbours classifier depends strongly on the definition of a distance function, which is used to retrieve the k-nearest neighbours in feature space. The distance function is sensitive to irrelevant, redundant, interacting or noisy features, which have a negative impact on the precision of lazy algorithms. The performance of lazy algorithms can be significantly improved with the use of an appropriate weight vector, where a feature weight represents the ability of the feature to distinguish pattern classes. In this paper, a filter-based feature weighting method to improve the performance of multi-label lazy algorithms is proposed. To learn the weights, the optimisation of a metric is carried out as a heuristic to estimate the feature weights. Experimental results on 21 multi-label datasets and 5 multi-label lazy algorithms confirm the effectiveness of the proposed feature weighting method for better multi-label lazy learning.
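The idea of a feature-weighted distance, as described in the abstract, can be illustrated with a minimal sketch. The following code is not the paper's method; it is a generic example, using a weighted Euclidean distance, of how a weight vector scales each feature's influence on the k-nearest-neighbour retrieval of a multi-label lazy learner (all names and the toy data are illustrative).

```python
import math

def weighted_distance(x, y, w):
    """Weighted Euclidean distance: a larger weight w[i] makes feature i
    more influential; w[i] = 0 removes it from the comparison entirely."""
    return math.sqrt(sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y)))

def k_nearest(query, training, w, k):
    """Retrieve the k training examples closest to `query` under the
    weighted distance; each example is a (features, label_set) pair."""
    ranked = sorted(training, key=lambda ex: weighted_distance(query, ex[0], w))
    return ranked[:k]

# Toy multi-label training set: two features, a set of labels per example.
train = [
    ([0.0, 5.0], {"a"}),
    ([1.0, 0.0], {"a", "b"}),
    ([9.0, 0.1], {"b"}),
]

# Weighting the first feature fully and suppressing the second (w = [1, 0])
# changes which neighbours are retrieved for the query, and hence which
# label sets a lazy algorithm would aggregate over.
neighbours = k_nearest([0.5, 9.0], train, w=[1.0, 0.0], k=2)
print([sorted(labels) for _, labels in neighbours])  # → [['a'], ['a', 'b']]
```

With uniform weights the noisy second feature would dominate the retrieval; the weight vector is what a feature weighting method, such as the one proposed in this paper, aims to learn.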
Keywords: Feature weighting, lazy learning algorithms, multi-label classification, label ranking, learning metric, evolutionary algorithms

* Corresponding author: Sebastián Ventura, Department of Computer Science and Numerical Analysis, University of Córdoba, Albert Einstein Building, Rabanales Campus, Córdoba, Spain. E-mail: sventura@uco.es, Phone: +34 957212218, Fax: +34 957218630.

1. Introduction

In the last few decades, studies in the field of supervised learning have dealt with the analysis of data where examples are associated with a single label [47, 49, 66]. However, there are several real-world problems where examples belong to a set of labels at the same time, known as multi-label problems [63]. In recent years, an increasing number of modern applications that contain multi-label data have appeared, such as text categorisation [38], emotions evoked by music [35], semantic annotation of images [73] and videos [8], and classification of protein function and genes [76].

Several multi-label lazy algorithms derived from the k-nearest neighbours (k-NN) classification scheme have been proposed in the multi-label learning context [12, 56, 72, 74, 77]. In general, these algorithms do not construct a model from the training set, postponing almost all processing until classification. They classify a query by retrieving its k-nearest neighbours in feature space and then applying an aggregation strategy to predict the set of labels of the query instance [63]. As with the single-label k-NN classifier, multi-label lazy algorithms depend strongly on the definition of the distance function used to determine the k-nearest neighbours of a query instance. The main disadvantage of the multi-label lazy algorithms is