International Journal of Engineering and Advanced Technology (IJEAT)
ISSN: 2249-8958 (Online), Volume-9 Issue-6, August 2020
Retrieval Number: F1442089620/2020©BEIESP
DOI: 10.35940/ijeat.F1442.089620
Journal Website: www.ijeat.org
Published By: Blue Eyes Intelligence Engineering and Sciences Publication
© Copyright: All rights reserved.

Abstract: "Learning to rank" (LTR) uses machine learning techniques to optimally combine many features to solve the problem of ranking. Web search is one of the prominent applications of LTR. To improve the ranking of webpages, a multimodality-based learning-to-rank model is proposed and implemented. Multimodality is the fusion, or the process of integrating, multiple unimodal representations into one compact representation. The main problem with web search is that the links that appear at the top of the search list may be irrelevant, or less relevant to the user, than links appearing at a lower rank. Research has shown that a multimodality-based search improves the ranked list produced. The modalities considered here are the text on a webpage and the images on a webpage. The textual features of the webpages are taken from the LETOR dataset, and the image features are extracted from the images inside the webpages using transfer learning: a VGG-16 model, pre-trained on ImageNet, is used as the image feature extractor. A baseline model trained only on textual features is compared against the multimodal LTR. The multimodal LTR, which integrates the visual and textual features, shows an improvement of 10-15% in web search accuracy.

Keywords: Learning to Rank, LETOR, LTR, transfer learning.

I. INTRODUCTION

A learning-to-rank algorithm is a machine learning algorithm that ranks documents automatically using an extracted feature set [4].
Learning-to-rank algorithms optimally combine features extracted from query-document pairs through discriminative training. LTR can also be used for rank aggregation, as in metasearch engines. Learning to rank is especially useful for search engines, which receive large amounts of training data daily in the form of user feedback and search logs; this data can continually improve their ranking mechanism. Ranking problems in IR can be tackled using two major approaches: the learning-to-rank (LTR) approach and the traditional, non-learning approach (BM25, language models, etc.). An LTR model automatically learns the parameters of the ranking function through training, whereas the other methods determine the ranking function heuristically. Heuristic tuning is feasible only when the ranking model has a few parameters; when the number of parameters is larger, it becomes difficult for a non-learning approach to establish a ranking function incorporating all the feature values. In contrast, multiple pieces of evidence can be put to good use by the learning-to-rank approach. This paper focuses on supervised ranking: how to order the webpages returned for a web search query more efficiently, using ranked lists of query-document pairs as training and testing datasets.

Revised Manuscript Received on August 06, 2020.
* Correspondence Author
Nikhila T Bhuvan*, Department of Information Technology, Rajagiri School of Engineering & Technology, Rajagiri Valley, Kakkanad, Kochi, Kerala, India. E-mail: nikhilatb@rajagiritech.edu.in
M Sudheep Elayidom, Division of Computer Science, School of Engineering, CUSAT, Kerala, India. E-mail: sudheep@cusat.ac.in
© The Authors. Published by Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
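The discriminative training described above can be sketched as a minimal pairwise (RankNet-style) update rule with a linear scoring function. The feature dimensionality, learning rate, and the two example feature vectors below are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

def ranknet_pair_update(w, x_i, x_j, lr=0.1):
    """One RankNet-style gradient step on a pair where document i
    should be ranked above document j. Scores are linear: s = w . x."""
    s_i, s_j = w @ x_i, w @ x_j
    # Probability that i is ranked above j under the logistic model
    p_ij = 1.0 / (1.0 + np.exp(-(s_i - s_j)))
    # Cross-entropy gradient when the target probability is 1
    grad = (p_ij - 1.0) * (x_i - x_j)
    return w - lr * grad

w = np.zeros(4)                            # linear ranking model
x_rel = np.array([1.0, 0.8, 0.6, 0.9])     # hypothetical relevant page
x_irr = np.array([0.2, 0.1, 0.3, 0.0])     # hypothetical irrelevant page
for _ in range(100):                       # train on the single pair
    w = ranknet_pair_update(w, x_rel, x_irr)
```

After these updates, the learned weights score the relevant page above the irrelevant one, which is exactly the behavior the training data (user feedback, search logs) is meant to induce.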
For each query q_k, there is an associated set of webpages {W_k1, W_k2, ..., W_kn}. The main focus is on how to order these webpages in a manner that satisfies the user. In the learning-to-rank method, both queries and webpages are represented as feature vectors or feature values. A query q and its associated webpage w can be represented by a feature vector x, where x = Φ(w, q); Φ is a feature extractor function based on, for example, BM25, PageRank, or the frequencies of query terms in the webpage. The image features are extracted by a deep learning model using a transfer learning technique from a pre-trained VGG-16 model. In earlier days, probabilistic methods were used to rank documents; now, ranking is learned automatically from training data. The training data provided to the learning model is the feature vector of a webpage. The proposed model is a multimodal learning-to-rank model that uses image and textual features to rank the webpages. The features from the images and the webpages are used to train the model, which is then used to re-rank the webpages. The visual features extracted from images of user interest, together with 46 textual webpage features, form the feature vector used to train the model. The work provides a relative ordering of webpages based on multiple modalities, namely text and images. The LETOR dataset is a collection of textual features of webpages; it is extended here with image features of user interest, extracted using transfer learning. This extended feature vector is used to rank the webpages, providing a better Mean Average Precision than existing ranking algorithms such as RankNet and AdaRank that are already implemented on the LETOR [5] dataset. The major contribution of the work to the research community can be stated as: a proposal to enhance the LETOR dataset with image features of webpages, advancing research in this area.
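As an illustration of one component of the feature extractor Φ(w, q), a BM25 score for a query-webpage pair can be computed as follows. The toy corpus, tokenization, and parameter values (k1 = 1.2, b = 0.75 are common defaults) are hypothetical and are not LETOR's exact feature definitions:

```python
import math

def bm25_score(query_terms, doc, corpus, k1=1.2, b=0.75):
    """BM25 score of one tokenized document for a query."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N   # average document length
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)           # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1.0)  # smoothed IDF
        tf = doc.count(term)                               # term frequency
        denom = tf + k1 * (1 - b + b * len(doc) / avgdl)
        score += idf * (tf * (k1 + 1)) / denom
    return score

corpus = [
    ["web", "search", "ranking"],
    ["image", "features", "deep", "learning"],
    ["learning", "to", "rank", "web", "pages"],
]
# One entry of the feature vector x = Φ(w, q) for the first document
x = [bm25_score(["web", "ranking"], corpus[0], corpus)]
```

Other entries of x (PageRank, query-term frequencies, and the remaining LETOR features) would be appended to the same vector in the same way.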
The use of transfer learning to extract features of out-of-domain images, and the use of these features along with the textual features to rank the web pages using deep neural networks.

A Multimodal Learning to Rank Model for Web Pages
Nikhila T Bhuvan, M Sudheep Elayidom
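The transfer-learning step can be sketched with Keras' VGG-16. A minimal sketch follows, assuming 224×224 RGB inputs and global average pooling over the convolutional base (yielding a 512-d vector per image); it instantiates the network with weights=None so the sketch runs without downloading weights, whereas actual transfer learning would pass weights="imagenet" to load the pre-trained filters:

```python
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input

# Convolutional base only (no classification head); pooling="avg"
# collapses the feature maps into one 512-d vector per image.
# For real transfer learning, use weights="imagenet".
extractor = VGG16(weights=None, include_top=False,
                  pooling="avg", input_shape=(224, 224, 3))

def image_features(img_batch):
    """img_batch: float array of shape (n, 224, 224, 3), values in [0, 255]."""
    return extractor.predict(preprocess_input(img_batch), verbose=0)

# A random array stands in for an image scraped from a webpage.
feats = image_features(np.random.rand(1, 224, 224, 3) * 255.0)
```

The resulting 512-d visual vector would then be concatenated with the 46 LETOR textual features to form the multimodal feature vector fed to the ranking network.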