Journal of Machine Learning Research 11 (2010) 451-490    Submitted 4/09; Revised 12/09; Published 2/10

Information Retrieval Perspective to Nonlinear Dimensionality Reduction for Data Visualization

Jarkko Venna        JARKKO.VENNA@NUMOS.FI
Jaakko Peltonen     JAAKKO.PELTONEN@TKK.FI
Kristian Nybo       KRISTIAN.NYBO@TKK.FI
Helena Aidos        HELENA.AIDOS@TKK.FI
Samuel Kaski        SAMUEL.KASKI@TKK.FI
Aalto University School of Science and Technology
Department of Information and Computer Science
P.O. Box 15400, FI-00076 Aalto, Finland

Editor: Yoshua Bengio

Abstract

Nonlinear dimensionality reduction methods are often used to visualize high-dimensional data, although the existing methods have been designed for other related tasks such as manifold learning. It has been difficult to assess the quality of visualizations since the task has not been well-defined. We give a rigorous definition for a specific visualization task, resulting in quantifiable goodness measures and new visualization methods. The task is information retrieval given the visualization: to find similar data based on the similarities shown on the display. The fundamental tradeoff between precision and recall of information retrieval can then be quantified in visualizations as well. The user needs to give the relative cost of missing similar points vs. retrieving dissimilar points, after which the total cost can be measured. We then introduce a new method NeRV (neighbor retrieval visualizer) which produces an optimal visualization by minimizing the cost. We further derive a variant for supervised visualization; class information is taken rigorously into account when computing the similarity relationships. We show empirically that the unsupervised version outperforms existing unsupervised dimensionality reduction methods in the visualization task, and the supervised version outperforms existing supervised methods.
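The retrieval view of visualization quality described in the abstract can be sketched concretely: take each point's nearest neighbors in the original space as the relevant items and its nearest neighbors on the display as the retrieved items, then measure precision and recall. The sketch below is illustrative only; the function and parameter names (`retrieval_quality`, `k_relevant`, `k_retrieved`) are ours, not the paper's, and NeRV itself optimizes smoothed probabilistic versions of these quantities rather than the hard set-overlap counts used here.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbors of each row of X (excluding itself)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    return np.argsort(d, axis=1)[:, :k]

def retrieval_quality(X_high, X_low, k_relevant=5, k_retrieved=5):
    """Mean precision and recall of neighbor retrieval from an embedding.

    Relevant items: k_relevant nearest neighbors in the original space.
    Retrieved items: k_retrieved nearest neighbors on the display.
    """
    relevant = knn_indices(X_high, k_relevant)
    retrieved = knn_indices(X_low, k_retrieved)
    precisions, recalls = [], []
    for rel, ret in zip(relevant, retrieved):
        hits = len(set(rel) & set(ret))
        precisions.append(hits / len(ret))  # fraction of retrieved that are relevant
        recalls.append(hits / len(rel))     # fraction of relevant that are retrieved
    return float(np.mean(precisions)), float(np.mean(recalls))

# Tiny demo: an "embedding" that copies the data preserves all neighborhoods,
# so both precision and recall are 1.0.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
p, r = retrieval_quality(X, X.copy())
print(p, r)
```

Given a user-specified relative cost of misses vs. false neighbors, a weighted combination of (1 - recall) and (1 - precision) gives the total cost that the paper proposes to minimize.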
Keywords: information retrieval, manifold learning, multidimensional scaling, nonlinear dimensionality reduction, visualization

1. Introduction

Visualization of high-dimensional data sets is one of the traditional applications of nonlinear dimensionality reduction methods. In high-dimensional data, such as experimental data where each dimension corresponds to a different measured variable, dependencies between different dimensions often restrict the data points to a manifold whose dimensionality is much lower than the dimensionality of the data space. Many methods are designed for manifold learning, that is, to find and unfold the lower-dimensional manifold. There has been a research boom in manifold learning since 2000, and there now exist many methods that are known to unfold at least certain kinds of manifolds successfully. Some of the successful methods include isomap (Tenenbaum et al., 2000), locally linear embedding (LLE; Roweis and Saul, 2000), Laplacian eigenmap (LE; Belkin and Niyogi, 2002a), and maximum variance unfolding (MVU; Weinberger and Saul, 2006).

© 2010 Jarkko Venna, Jaakko Peltonen, Kristian Nybo, Helena Aidos and Samuel Kaski.