International Journal of Engineering and Advanced Technology (IJEAT)
ISSN: 2249-8958 (Online), Volume-9 Issue-6, August 2020
Retrieval Number: F1442089620/2020©BEIESP
DOI: 10.35940/ijeat.F1442.089620
Journal Website: www.ijeat.org
Published By: Blue Eyes Intelligence Engineering and Sciences Publication
© Copyright: All rights reserved.
Abstract: “Learning-to-rank” or LTR utilizes machine
learning technologies to optimally combine many features to solve
the problem of ranking. Web search is one of the prominent
applications of LTR. To improve the ranking of webpages, a
multimodality-based learning-to-rank model is proposed and
implemented. Multimodality is the fusion of multiple unimodal
representations into one compact representation. The main problem
with web search is that the links appearing at the top of the search
list may be irrelevant, or less relevant to the user, than those
appearing at a lower rank. Research has shown that a
multimodality-based search improves the populated rank list. The multiple
modalities considered here are the text on a webpage as well as the
images on a webpage. The textual features of the webpages are
extracted from the LETOR dataset and the image features of the
webpages are extracted from the images inside the webpages
using the concept of transfer learning. VGG-16 model,
pre-trained on ImageNet is used as the image feature extractor.
The baseline model, trained using only textual features, is
compared against the multimodal LTR. The multimodal LTR,
which integrates the visual and textual features, shows an
improvement of 10-15% in web search accuracy.
Keywords: Learning to Rank, LETOR, LTR, transfer learning.
I. INTRODUCTION
A learning-to-rank algorithm is a machine learning algorithm
that ranks documents automatically using an extracted
feature set [4]. Learning-to-rank algorithms optimally
combine features extracted from query-document pairs through
discriminative training. It can even be used for rank
aggregation, as in metasearch engines. Learning to rank is
especially useful for search engines because every day they receive
large amounts of training data in the form of user feedback and
search logs, which can continually improve their
ranking mechanism. The ranking problems in IR could be
tackled using two major approaches: the learning to rank
(LTR) approach and the traditional (non-learning) approach,
such as BM25, language models, etc. The LTR model
automatically learns the parameters of the ranking function by
training, whereas the other methods heuristically determine the
ranking function.

Revised Manuscript Received on August 06, 2020.
* Correspondence Author
Nikhila T Bhuvan*, Department of Information Technology, Rajagiri
School of Engineering & Technology, Rajagiri Valley, Kakkanad, Kochi,
Kerala, India. E-mail: nikhilatb@rajagiritech.edu.in
M Sudheep Elayidom, Division of Computer Science, School of
Engineering, CUSAT, Kerala, India. E-mail: sudheep@cusat.ac.in
© The Authors. Published by Blue Eyes Intelligence Engineering and
Sciences Publication (BEIESP). This is an open access article under the CC
BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Heuristic tuning is feasible only if the
ranking model has only a few parameters. It becomes difficult
for a non-learning approach to establish a ranking function
incorporating all the feature values when the number of
parameters is large. In contrast, multiple pieces of evidence
can be of good use for the learning to rank approach. This
paper focuses on supervised ranking on how to order the
webpages of a web search query much more efficiently, using
a ranked list of query document pairs as training and testing
datasets. For each query qk, there is an associated set of
webpages {Wk1, Wk2, . . . , Wkn}. The main focus is on how to
order these webpages in a way that satisfies the user. In the
learning-to-rank method, both the query and the webpages are represented as
feature vectors or feature values. A query q and its associated
webpage w can be represented by a feature vector x, where
x = Φ(w, q). Φ is a feature extractor function based on BM25
or PageRank or frequencies of query terms in the webpage.
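The feature extractor Φ described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the BM25 parameters (k1 = 1.2, b = 0.75), the toy corpus, and the choice of three features are assumptions made for the example.

```python
import math
from collections import Counter

def bm25_score(doc_tokens, query_tokens, corpus, k1=1.2, b=0.75):
    """One classic relevance feature: the BM25 score of a document for a query."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    tf = Counter(doc_tokens)
    score = 0.0
    for term in query_tokens:
        df = sum(1 for d in corpus if term in d)           # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1.0)  # smoothed IDF
        f = tf[term]
        score += idf * f * (k1 + 1) / (
            f + k1 * (1 - b + b * len(doc_tokens) / avgdl))
    return score

def phi(doc_tokens, query_tokens, corpus):
    """x = phi(w, q): map a (webpage, query) pair to a numeric feature vector.
    Here: [BM25 score, total query-term frequency, document length]."""
    qtf = sum(Counter(doc_tokens)[t] for t in query_tokens)
    return [bm25_score(doc_tokens, query_tokens, corpus), qtf, len(doc_tokens)]

corpus = [["learning", "to", "rank"],
          ["web", "search", "ranking"],
          ["image", "features"]]
x = phi(corpus[0], ["rank", "learning"], corpus)
print(len(x))  # 3
```

In a full LTR pipeline each such scalar (BM25, PageRank, term frequencies, etc.) becomes one coordinate of the feature vector x for the query-document pair.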
The image feature is extracted through a deep learning model
using a transfer learning technique from a pre-trained
VGG-16 model.
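The transfer-learning step can be sketched as below, assuming a Keras/TensorFlow setup. Note one deliberate simplification: the paper uses ImageNet weights (`weights="imagenet"`), whereas this sketch passes `weights=None` so it runs without downloading the pre-trained filters; the pipeline is otherwise the same.

```python
import numpy as np
import tensorflow as tf

# VGG-16 as a frozen image feature extractor (transfer learning).
# include_top=False drops the classifier head; pooling="avg" gives a
# fixed-length 512-dim feature vector per image.
base = tf.keras.applications.VGG16(weights=None, include_top=False,
                                   pooling="avg", input_shape=(224, 224, 3))
base.trainable = False  # freeze the convolutional base

def image_features(img_batch):
    """img_batch: float array of shape (n, 224, 224, 3), pixel values 0-255."""
    x = tf.keras.applications.vgg16.preprocess_input(img_batch.copy())
    return base.predict(x, verbose=0)

# Two random "webpage images" stand in for real inputs.
imgs = (np.random.rand(2, 224, 224, 3) * 255).astype("float32")
feats = image_features(imgs)
print(feats.shape)  # (2, 512)
```

With ImageNet weights loaded, the 512-dimensional output would carry generic visual features learned on ImageNet, which is the essence of the transfer-learning approach described here.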
In earlier days, a probabilistic method was used to rank the
documents, whereas now it is an automatic process of
learning based on training data. The training data
provided to the learning model is the feature vector of a
webpage. The proposed model is a multimodal learning-to-rank model that uses
image and textual features to rank the webpages. The features
from images and webpages are used to train the model, which
is then used to re-rank the webpages. The visual features extracted
from the images of user interest and the 46 textual webpage
features together form the feature vector used to train the
model.
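The fusion described above amounts to concatenating the two unimodal vectors into one compact multimodal representation. A minimal sketch, where the 512-dimensional visual vector is an assumption based on VGG-16 with global average pooling:

```python
import numpy as np

rng = np.random.default_rng(0)
textual = rng.random(46)    # the 46 LETOR textual features of a webpage
visual = rng.random(512)    # image features from the CNN extractor

# Multimodal fusion: one compact feature vector for the ranking model.
fused = np.concatenate([textual, visual])
print(fused.shape)  # (558,)
```

The ranking model is then trained on these fused vectors instead of the textual features alone.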
The work provides a relative ordering of webpages based
on multiple modalities, namely text and images. The LETOR
dataset is a collection of textual features of webpages.
The LETOR dataset is extended with the image features of
user interest, extracted using transfer
learning. This extended feature vector is used to rank the
webpages and provides a better Mean Average Precision than
existing ranking algorithms such as RankNet and AdaRank
that are already implemented on the LETOR [5] dataset.
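Mean Average Precision (MAP), the metric used above to compare rankers, can be computed as follows; the binary relevance labels in the example are toy values, not results from the paper.

```python
def average_precision(ranked_relevance):
    """ranked_relevance: 0/1 relevance labels in ranked order.
    AP = mean of precision@k over the ranks k of relevant documents."""
    hits, precisions = 0, []
    for k, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(queries):
    """MAP = average of AP over all queries."""
    return sum(average_precision(q) for q in queries) / len(queries)

# Toy example: two queries with ranked binary relevance judgments.
print(mean_average_precision([[1, 0, 1, 0], [0, 1, 1, 0]]))  # 17/24 ~ 0.708
```

A better ranker pushes relevant documents toward the top, raising the precision@k terms and hence the MAP.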
The major contributions of the work to the research
community can be stated as:
- A proposal to enhance the LETOR dataset with image
features of webpages, improving research in this
area.
- The use of transfer learning to extract features of out-of-domain
images, and the use of these features along with the
textual features to rank the web pages using deep
neural networks.
A Multimodal Learning to Rank model for Web
Pages
Nikhila T Bhuvan, M Sudheep Elayidom