A Topography-Preserving Latent Variable Model with Learning Metrics

Samuel Kaski and Janne Sinkkonen

Helsinki University of Technology
Neural Networks Research Centre
P.O. Box 5400, FIN-02015 HUT, Finland
{samuel.kaski,janne.sinkkonen}@hut.fi

Abstract. We introduce a new mapping model from a latent grid to the input space. The mapping preserves the topography but measures local distances in terms of auxiliary data that implicitly conveys information about the relevance or importance of local directions in the primary data space. Soft clusters corresponding to the map grid locations are defined in the primary data space, and a distortion measure is minimized for paired samples of primary and auxiliary data. The Kullback-Leibler divergence-based distortion is measured between the conditional distributions of the auxiliary data given the primary data, and the model is optimized with stochastic approximation, yielding an algorithm that resembles the Self-Organizing Map but in which distances are computed by taking into account the (local) relevance of directions.

1 Introduction

Topography-preserving latent variable models like the Self-Organizing Map (SOM) [2,3] are valuable tools, especially for descriptive data analysis tasks that create overviews of the data. Such models form an organized mapping from the latent space, usually a two-dimensional discrete map grid, to the input space. The map grid can be used as a graphical display on which close-by locations represent similar data. Additional properties of the data, such as the density (cluster) structure and the distribution of the values of data variables, can be visualized on the display.

The mapping characterizes the probability density p(x) of the multivariate (vectorial) data x, but it depends on the metric of the data space.
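The dependence on the metric can be illustrated with a minimal sketch. It is not the model proposed in this paper; it only assumes a hypothetical two-unit codebook and a plain Euclidean winner search of the kind used in a standard SOM. The names `euclidean`, `best_matching_unit` and `rescale`, and all numeric values, are illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def best_matching_unit(x, codebook):
    """Index of the codebook vector closest to x in the Euclidean metric."""
    return min(range(len(codebook)), key=lambda i: euclidean(x, codebook[i]))


def rescale(v, scales):
    """Multiply each variable by a scale factor, i.e. change its units."""
    return [vi * si for vi, si in zip(v, scales)]


# Hypothetical model vectors of two map units and one data sample.
codebook = [[0.0, 1.0], [2.0, 0.0]]
x = [0.5, 0.0]

# Winner in the original units.
bmu_original = best_matching_unit(x, codebook)

# Same data, but the second variable is expressed in units ten times smaller
# (the factor 10 is an arbitrary choice for illustration).
scales = [1.0, 10.0]
bmu_rescaled = best_matching_unit(
    rescale(x, scales), [rescale(m, scales) for m in codebook]
)

print(bmu_original, bmu_rescaled)  # prints: 0 1
```

The data are unchanged; only the units of the second variable differ, yet the sample is assigned to a different map unit.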
The metric in turn depends on the feature extraction: it is usually selected by first choosing the data variables and their relative scales, and then using a simple global measure such as the Euclidean distance.

Feature extraction is often far from trivial. The variables may be of diverse nature, have different units of measurement, and their relative importance may be unknown. Moreover, the relative importance may be different in different locations of the data space.

© 2001 Springer-Verlag. Reprinted with permission from Allinson N., Yin H., Allinson L. and Slack J. (editors), Advances in Self-Organizing Maps. Springer-Verlag, London, pages 224-229.

We have earlier studied methods for learning suitable local distance measures. The task is impossible unless more information is brought to the setup. Our assumption has been that there is available some auxiliary data c whose