SOM-ELM — Self-Organized Clustering using ELM

Yoan Miche a,f, Anton Akusok b,*, David Veganzones c, Kaj-Mikael Björk d, Eric Séverin c, Philippe du Jardin e, Maite Termenon g,h, Amaury Lendasse b,d

a Department of Information and Computer Science, Aalto University School of Science, FI-00076, Finland
b Department of Mechanical and Industrial Engineering and the Iowa Informatics Initiative, The University of Iowa, Iowa City, IA 52242-1527, USA
c University of Lille 1, IAE, 104 avenue du peuple Belge, 59043 Lille, France
d Arcada University of Applied Sciences, 00550 Helsinki, Finland
e EDHEC Business School, BP3116, 06202 Nice cedex 3, France
f Nokia Solutions and Networks Group, Espoo, Finland
g Inserm, U836, Grenoble F-38043, France
h Univ. Grenoble Alpes, GIN, F-38000 Grenoble, France

* Corresponding author.

Article history: Received 28 September 2014; received in revised form 24 February 2015; accepted 6 March 2015. Communicated by G.-B. Huang.

Keywords: ELM; Self-Organized; SOM; Clustering

Abstract

This paper presents two new clustering techniques based on the Extreme Learning Machine (ELM). These clustering techniques can incorporate a priori knowledge (of an expert) to define the optimal structure of the clusters, i.e. the number of points in each cluster. Using ELM, the first proposed clustering problem formulation can be rewritten as a Traveling Salesman Problem and solved by a heuristic optimization method. The second proposed formulation includes both a priori knowledge and a self-organization based on a predefined map (or string). The clustering methods are successfully tested on 5 toy examples and 2 real datasets.

© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Clustering is the general task of grouping similar objects together [1]. A unique solution does not always exist for a given input dataset [2], apart from the trivial assignments of all data samples to a single cluster, or of each sample to its own cluster.
Clustering algorithms rely on different assumptions about the data: hierarchical clustering [3,4] groups nearby objects together, k-means clustering techniques [5,6] find dense clusters in a less dense space, and Expectation-Maximization (EM) algorithms [7] assume that the distribution of samples in each cluster can be approximated by a multivariate Gaussian distribution [8,9]. The number of clusters can be estimated automatically by some methods (such as the density-based DBSCAN [10]), while in other methods, like k-means, it is a hyper-parameter optimized using a cost function. Not all clusters present in a dataset are easily separable: for instance, the EM algorithm distinguishes density-based clusters poorly, and the k-means algorithm tends to find clusters of similar size [11]. The exact number of clusters may therefore be difficult to determine a priori.

There is a special class of clustering problems in which the desired cluster configuration is given in advance. The number of clusters and the number of samples in each cluster are known a priori (expert a priori knowledge), and the cluster assignment of each sample is to be found. This configuration helps in solving tasks with highly uneven cluster sizes. The solution is a mapping between data samples and the desired clusters, which generalizes to new samples and can therefore be used for prediction as well.

The problem of finding a mapping between a set of data samples and a set of "sample slots" in clusters (i.e. finding which sample should go in which cluster) is an NP-hard set-ordering problem, for which heuristic methods are well suited [12]. One requirement is a fast general cost function that estimates how well the samples are mapped to the clusters: it should give a high value for distant samples mapped to the same cluster, and a low value for similar samples mapped together.
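The cost-function requirement above can be illustrated with a minimal sketch. This is not the ELM-based cost function developed later in the paper; `assignment_cost` and the toy data are invented for illustration, and the cost is simply the sum of within-cluster pairwise squared distances:

```python
import numpy as np

def assignment_cost(X, assignment, n_clusters):
    """Sum of pairwise squared distances within each cluster.

    High when distant samples share a cluster,
    low when similar samples are grouped together.
    """
    cost = 0.0
    for c in range(n_clusters):
        members = X[assignment == c]
        if len(members) < 2:
            continue
        # All pairwise differences; each unordered pair counted twice.
        diffs = members[:, None, :] - members[None, :, :]
        cost += 0.5 * (diffs ** 2).sum()
    return cost

# Two well-separated blobs of 3 samples each
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
good = np.array([0, 0, 0, 1, 1, 1])  # assignment respecting the blobs
bad = np.array([0, 1, 0, 1, 0, 1])   # assignment mixing the blobs
print(assignment_cost(X, good, 2) < assignment_cost(X, bad, 2))  # True
```

A heuristic optimizer would search the space of fixed-size assignments for one that minimizes such a cost.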
An Extreme Learning Machine (ELM) is suitable for this task through its fast nonlinear cost function, and it is exploited in our proposed method, described in Section 2. A modified version of this general cost function, which also accounts for the self-organization property of the proposed SOM-ELM, is presented in Section 2.5.

The Self-Organizing Clustering [13–15] addressed in this paper differs from the well-known Self-Organizing Maps (SOM [16,17]) in that the points in the original feature space are mapped to clusters, which act as the self-organizing units, whereas the clusters, unlike SOM units, do not carry representative codebook vectors.

Please cite this article as: Y. Miche, et al., SOM-ELM — Self-Organized Clustering using ELM, Neurocomputing (2015), http://dx.doi.org/10.1016/j.neucom.2015.03.014
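The speed of ELM comes from its training procedure: the hidden layer is random and fixed, so only the output weights are fitted, by a single least-squares solve. The sketch below shows a generic ELM in this standard form; it is not the SOM-ELM method of this paper, and the function names and toy problem are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_fit(X, T, n_hidden=50):
    """Train a basic ELM: random hidden layer, least-squares output weights."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights (never trained)
    b = rng.standard_normal(n_hidden)                # random biases
    H = np.tanh(X @ W + b)                           # nonlinear hidden activations
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)     # solve min ||H beta - T||
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression: approximate y = sin(x)
X = np.linspace(-3, 3, 200)[:, None]
T = np.sin(X)
W, b, beta = elm_fit(X, T)
err = np.mean((elm_predict(X, W, b, beta) - T) ** 2)
print(err)  # small training MSE
```

Because training reduces to one linear solve, evaluating an ELM-based cost over many candidate cluster assignments remains computationally cheap.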