Locality Sensitive Hashing for Fast Computation of Correlational Manifold Learning based Feature Space Transformations

Vikrant Singh Tomar, Richard C. Rose
Department of Electrical and Computer Engineering, McGill University, Montreal, QC, Canada
vikrant.tomar@mail.mcgill.ca, rose@ece.mcgill.ca

Abstract

Manifold learning based techniques have been found to be useful for feature space transformations and semi-supervised learning in speech processing. However, the immense computational requirements of building neighborhood graphs have hindered the application of these techniques to large speech corpora. This paper presents an approach for fast computation of neighborhood graphs in the context of manifold learning. The approach, known as locality sensitive hashing (LSH), is applied to a discriminative manifold learning based feature space transformation technique that utilizes a cosine-correlation based distance measure. Performance is evaluated first in terms of computational savings at a given level of ASR performance. The results demonstrate that LSH provides a factor of 9 reduction in computational complexity with minimal impact on speech recognition performance. A study is also performed comparing the efficiency of the LSH algorithm presented here with that of other LSH approaches in identifying nearest neighbors.

Index Terms: locality sensitive hashing, correlation preserving discriminant analysis, discriminative manifold learning

1. Introduction

Manifold learning based feature space transformations assume that data points reside on or close to the surface of a lower dimensional manifold. These techniques attempt to capture the underlying manifold based relationships among data vectors in order to find a target feature representation in which the underlying relationships between feature vectors are preserved [1-3]. It has been suggested that the acoustic feature space is confined to lie on one or more low dimensional manifolds [4, 5].
Therefore, a feature space transformation technique that explicitly models and preserves the local relationships of data along the underlying manifold should be more effective for speech processing.

Multiple studies have demonstrated gains in automatic speech recognition (ASR) performance when using features derived from a manifold learning approach. Tang et al. reported gains in ASR performance using features derived from locality preserving projections (LPP) [6]. In previous work [7, 8], the authors presented discriminative manifold learning techniques that led to significant improvements in ASR word error rates (WER) compared to well-known techniques such as linear discriminant analysis (LDA) [9, 10] and LPP [1, 6]. However, despite having shown significant improvements in ASR performance on some tasks, manifold learning based algorithms have yet to find widespread usage in speech processing. This lack of acceptance can be attributed to the high computational complexity and noise sensitivity of these algorithms [1, 3, 6, 11, 12].

(This work is supported by the Natural Sciences and Engineering Research Council of Canada, and McGill University.)

The computational complexity of manifold learning techniques originates from the need to construct nearest neighborhood based relationships. Typically, a pairwise distance measure is used. For a dataset containing N feature vectors of dimensionality d each, the construction of nearest neighborhood based graphs requires computation time of O(dN^2). Speech datasets typically have hundreds of millions of feature vectors, each having dimensionality in the range of 100-200. For such datasets, an algorithm with O(dN^2) complexity can be computationally infeasible. Although there exist a number of algorithms that allow for faster neighborhood calculations, such as kd-trees, many of these algorithms degrade to the complexity of linear search as the dimensionality of the data increases [13].
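To make the O(dN^2) cost concrete, the brute-force neighborhood graph construction described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: Euclidean distance is used here purely for simplicity (the paper's CPDA method uses a cosine-correlation measure), and all names are ours.

```python
import numpy as np

def knn_graph_brute_force(X, k):
    """Build a k-nearest-neighbor graph by exhaustive pairwise
    distance computation: O(d * N^2) time for N vectors of
    dimension d. Infeasible for speech corpora with hundreds of
    millions of frames."""
    # Pairwise squared Euclidean distances via broadcasting:
    # ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 x_i . x_j
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # exclude self-matches
    # For each vector, the indices of its k nearest neighbors.
    return np.argsort(d2, axis=1)[:, :k]

# Toy example: 200 frames of 39-dimensional acoustic features.
X = np.random.RandomState(0).randn(200, 39)
nbrs = knn_graph_brute_force(X, k=5)
print(nbrs.shape)  # (200, 5)
```

The quadratic term comes from the full N-by-N distance matrix; tree-based structures such as kd-trees avoid it in low dimension, but, as noted above, lose their advantage as d grows.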
This work investigates a fast algorithm for neighborhood calculations, locality sensitive hashing (LSH) [14-16], as applied to manifold learning based correlation preserving discriminant analysis (CPDA) in ASR [8]. The algorithm creates hashed signatures of feature vectors in order to distribute the vectors into a number of discrete buckets. The underlying concept is that vectors with strong correlation are more likely to fall into the same bucket. It is shown that LSH can drastically reduce the computational complexity of manifold learning algorithms. In this work, LSH is shown to provide a factor of 10 speedup without significant impact on ASR performance.

LSH is incorporated within the CPDA framework for fast computation of neighborhood graphs. CPDA is a supervised discriminative manifold learning algorithm that attempts to preserve the underlying local sub-manifold based relationships of feature vectors while at the same time maximizing a criterion related to the separability between classes of vectors. The algorithm utilizes a cosine-correlation based distance measure instead of the conventional Euclidean measures. The use of a cosine-correlation based measure is motivated by studies suggesting that additive noise in the linear spectrum domain alters the norm of cepstrum feature vectors [17], and that the angles between cepstrum vectors are comparatively more robust to noise [18]. Thus, techniques that use a correlation based distance measure for characterizing the relationships between features are less susceptible to ambient noise than techniques that use a Euclidean measure. Accordingly, CPDA has demonstrated significantly improved ASR performance in noisy environments [8]. Correlation preservation based techniques have also been used in other application domains [11, 19].

2. Correlation Preserving Discriminant Analysis

This section summarizes the CPDA algorithm. A more detailed discussion can be found in [8].
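The bucketing idea behind LSH for a cosine-correlation measure can be sketched with the standard random-hyperplane (sign random projection) hash family. This is a generic illustration under that assumption; the specific hash family and parameters used in the paper may differ, and all names here are ours.

```python
import numpy as np

def hash_signatures(X, n_bits, rng):
    """Sign-random-projection LSH: each random hyperplane contributes
    one signature bit. Two vectors agree on a bit with probability
    1 - theta/pi, where theta is the angle between them, so vectors
    with strong cosine correlation tend to share signatures."""
    H = rng.randn(X.shape[1], n_bits)  # random hyperplane normals
    return (X @ H) > 0                 # boolean signature matrix

def bucketize(bits):
    """Pack each boolean signature into one integer bucket key."""
    weights = 1 << np.arange(bits.shape[1])
    return bits.astype(np.int64) @ weights

rng = np.random.RandomState(0)
X = rng.randn(1000, 100)
sigs = hash_signatures(X, n_bits=16, rng=rng)
buckets = bucketize(sigs)

# Candidate neighbors of vector 0 are only the vectors sharing its
# bucket, so exact distances need be computed within buckets only.
candidates = np.flatnonzero(buckets == buckets[0])
```

Restricting exact distance computation to within-bucket candidates is what replaces the O(dN^2) all-pairs search; in practice several independent hash tables are used to reduce the chance of missing true neighbors.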
CPDA is a discriminative manifold learning technique that attempts to maximize class separability

Copyright 2013 ISCA. INTERSPEECH 2013, 25-29 August 2013, Lyon, France. 10.21437/Interspeech.2013-440