International Journal of Innovative Technology and Exploring Engineering (IJITEE) ISSN: 2278-3075, Volume-8 Issue-10, August 2019. Published By: Blue Eyes Intelligence Engineering & Sciences Publication. Retrieval Number I8157078919/2019©BEIESP. DOI: 10.35940/ijitee.I8157.0881019

Abstract: With the growth of multi-modal data, large amounts of data are being generated. Nearest Neighbor (NN) search is widely used to retrieve information, but it suffers when the data is high-dimensional. Approximate Nearest Neighbor (ANN) search is therefore used extensively by researchers; using semantic hashing, the data is represented as binary codes, which reduces storage cost and improves retrieval speed. In addition, deep learning has shown good performance in information retrieval and handles the scalability problem efficiently. Because the modalities of multi-modal data have different statistical properties, a method is needed that finds the semantic correlation between them. In this paper, an experiment is performed using the correlation methods CCA, KCCA and DCCA on the MNIST dataset. The MNIST data used here is a multi-view dataset, and the results show that DCCA outperforms CCA and KCCA by learning representations with higher correlations. Furthermore, because users' requirements are flexible, cross-modal retrieval, which works across modalities, plays a very important role. Traditional cross-modal hashing techniques are based on hand-crafted features, so their performance is not satisfactory, because feature learning and binary code generation are independent processes. In addition, traditional cross-modal hashing techniques fail to bridge the heterogeneity gap across modalities. Many deep cross-modal hashing techniques have therefore been proposed, which improve performance compared with non-deep cross-modal techniques. In this paper, we present a comprehensive survey of hashing techniques that work across modalities.
Index Terms: Multi-modal data, Deep CCA (DCCA), cross-modal retrieval, hashing.

Revised Manuscript Received on August 05, 2019.
Nikita Bhatt, U & P U Patel Department of Computer Engineering, CSPIT, CHARUSAT, Gujarat, India. E-mail: nikitabhatt.ce@charusat.ac.in
Dr. Amit Ganatra, U & P U Patel Department of Computer Engineering, CSPIT, CHARUSAT, Gujarat, India.

I. INTRODUCTION

With the advancement of the World Wide Web, different types of data such as text, images, audio and video are generated that are semantically consistent with one another. Such data is called multi-modal data. As users' requirements are very flexible, there is a need to develop a retrieval system that works across different modalities. Such retrieval is called cross-modal retrieval, where users can give any modality as the input [1,3,6,21,27]. In addition, such retrieval provides complementary information, which may be useful in decision making or in a recommendation system. Nearest Neighbor (NN) search is widely used in information retrieval but becomes very expensive as dimensionality increases. Researchers therefore focus on Approximate Nearest Neighbor (ANN) search, which resolves the problem of NN by giving an approximate solution [3,4]. Hashing, an indexing scheme for ANN, is widely used because it maps high-dimensional data to binary codes, in contrast to tree-based indexing schemes [21,27]. Such binary representation requires less time and less space, which enables efficient retrieval [2,4,5]. To retrieve information across multi-modal data, multi-modal hashing (MMH) is used. MMH is categorized into two parts: Multi-Source Hashing (MSH) and Cross-Modal Hashing (CMH). However, the application scenario of MSH is limited compared with CMH, because having all modalities of the data present, a prerequisite for MSH, might not hold in practical scenarios. CMH is therefore used; it exploits the correlation among modalities to enable cross-modal similarity search [11,18,25,26,27].
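To make the hashing idea above concrete, the following is a minimal sketch, not from the paper itself, of random-projection hashing: high-dimensional vectors are mapped to short binary codes, and an ANN query is answered by ranking codes by Hamming distance. All data shapes and names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy high-dimensional database: 1000 points in 128 dimensions
# (illustrative sizes, not tied to any dataset in the paper).
X = rng.normal(size=(1000, 128))

# Random-projection hashing: each of 32 random hyperplanes
# contributes one bit, so every point becomes a 32-bit binary code.
planes = rng.normal(size=(128, 32))
codes = (X @ planes > 0).astype(np.uint8)

def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return int(np.sum(a != b))

# ANN query: hash the query the same way, then rank database
# codes by Hamming distance instead of comparing raw vectors.
q = rng.normal(size=128)
q_code = (q @ planes > 0).astype(np.uint8)
dists = np.array([hamming(q_code, c) for c in codes])
nearest = np.argsort(dists)[:5]  # indices of 5 approximate neighbours
```

Comparing 32-bit codes is far cheaper in time and space than comparing 128-dimensional floating-point vectors, which is exactly the efficiency argument made for hashing-based ANN above.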
Existing CMH strategies treat feature learning and hash-code learning as independent steps, which may not achieve satisfactory results. Since the emerging technique of deep learning has shown promising results in feature generation, it is used not only as a feature extractor but also as a hash-code generator, within a single framework [2,3]. The remaining portion of the paper covers different cross-modal retrieval methods. For efficient retrieval, it is necessary to find the semantic similarity between multi-modal data, since the modalities have different statistical properties. Many methods in the literature find semantic similarity between multi-modal data; here an experiment is performed using canonical correlation analysis (CCA), kernel canonical correlation analysis (KCCA) and deep canonical correlation analysis (DCCA).

II. STUDY ON CROSS-MODAL RETRIEVAL METHODS

Cross-modal retrieval systems are broadly divided into common subspace-learning methods and cross-modal hashing methods [31]. In common subspace-learning, the different modalities are mapped to a common subspace that preserves the similarity across modalities. For faster retrieval, the common representation is then mapped to binary codes using hashing techniques. The remaining portion covers different methods for common subspace-learning and cross-modal hashing [31].

A. Common Subspace-Learning

The subspace-learning technique learns a common subspace that preserves the correlations among the various modalities, in which similarity can be computed directly [10]. Figure 1 shows how common subspace is

Semantic Correlation Based Deep Cross-Modal Hashing for Faster Retrieval
Nikita Bhatt, Amit Ganatra
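The CCA step of the experiment described above can be sketched in plain NumPy. This is a hedged illustration under assumed synthetic data, not the paper's actual experimental setup: two views sharing a latent signal stand in for the two-view MNIST data, and the canonical correlations are obtained as the singular values of the whitened cross-covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic "views" of the same 500 samples sharing a
# 2-dimensional latent signal z (a stand-in for the two-view
# MNIST data used in the paper's experiment).
z = rng.normal(size=(500, 2))
view1 = z @ rng.normal(size=(2, 10)) + 0.1 * rng.normal(size=(500, 10))
view2 = z @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(500, 8))

def canonical_correlations(X, Y, reg=1e-6):
    """Canonical correlations between views X and Y (rows = samples)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / n + reg * np.eye(X.shape[1])  # view-1 covariance
    Syy = Y.T @ Y / n + reg * np.eye(Y.shape[1])  # view-2 covariance
    Sxy = X.T @ Y / n                             # cross-covariance

    def inv_sqrt(S):
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    T = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(T, compute_uv=False)

corrs = canonical_correlations(view1, view2)
```

Because the two views share a 2-dimensional latent signal, the top two canonical correlations come out close to 1 while the remaining ones stay small. KCCA replaces the linear projections with kernel-induced ones, and DCCA learns the two projections with deep networks trained to maximize exactly this correlation objective.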