International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163
Issue 10, Volume 2 (October 2015) www.ijirae.com
© 2014-15, IJIRAE- All Rights Reserved Page -18
Privacy Preservation in Personalized Web Search
Mr. Arun Desai*, Computer Networks, Flora Institute of Technology, Pune
Mr. Pankaj Chandre, Computer Networks, Flora Institute of Technology, Pune
Abstract— In recent years, personalized web search (PWS) has demonstrated effectiveness in improving the
quality of search services on the Internet. However, evidence shows that users’ reluctance to disclose their
private information during search has become a major barrier to the wide proliferation of PWS. We study privacy
protection in PWS applications that model user preferences as hierarchical user profiles. This paper proposes a PWS
framework called UPS that adaptively generalizes profiles for each query while respecting user-specified privacy
requirements. This generalization aims to strike a balance between two predictive metrics that evaluate the utility of
personalization and the privacy risk of exposing the generalized profile.
Keywords—personalized web search (PWS); privacy protection; hierarchical user profile; utility; privacy risk
I. Introduction
A huge amount of information is added to the Web every day. Publicly visible text is created at a rate on the order of
10 GB per day, and private text (including user email, IM messages, tags, reviews, etc.) at a rate on the order of
3 terabytes per day. This rapidly increasing scale is in many ways limiting the utility of the Web. There is a high level
of noise, ranging from spam to large amounts of uninteresting, irrelevant, and duplicated content. Search engines and other
forms of ranking are unable to keep up with this growth. Recently, search engines have started showing Wikipedia links as the
top search result because ranking has become very hard.
Personalized search is a promising way to improve the accuracy of web search and has attracted much attention
recently. However, effective personalized search requires collecting and aggregating user information, which raises
serious privacy concerns for many users. Indeed, these concerns have become one of the main barriers
to deploying personalized search applications, and achieving privacy-preserving personalization remains a great challenge.
The web search engine has long been the most important portal for ordinary people looking for useful information on
the web. However, users may be frustrated when search engines return irrelevant results that do not meet their real
intentions. Such irrelevance is largely due to the enormous variety of users’ contexts and backgrounds, as well as the
ambiguity of texts. Personalized web search (PWS) is a general category of search techniques that aim to provide better
search results tailored to individual user needs. The cost is that user information has to be collected and
analyzed to infer the user intention behind the issued query.
As the amount of information on the web continuously grows, it has become increasingly difficult for web search
engines to find information that satisfies users’ individual needs. Personalized search is a promising way to improve
search quality by customizing search results for people with different information goals. Many recent research efforts
have focused on this area. Most of them fall into two general approaches: (1) re-ranking the query results
returned by search engines locally using personal information, or (2) sending personal information and queries together to
the search engine. A good personalization algorithm relies on rich user profiles and a rich web corpus. However, because the
web corpus resides on the server, re-ranking on the client side is bandwidth intensive: it requires a large number of search
results to be transmitted to the client before re-ranking. Alternatively, if the amount of information transmitted is limited
through filtering on the server side, the approach depends on the desired information surviving the filtering,
which is not always the case. Therefore, most online personalized search services, such as Google Personalized Search and
Yahoo! My Web, adopt the second approach and tailor results on the server by analyzing collected personal information,
e.g., personal interests and search histories.
Nonetheless, this server-side approach raises privacy issues because it exposes personal information to a public server.
It usually requires users to grant the server full access to their personal and behavioural information on the Internet.
Gleaning such information without the user’s permission would violate the individual’s privacy.
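For contrast, the first approach (local re-ranking) keeps the profile on the client. The Python sketch below is only an illustration under stated assumptions, not the method of any system named above: the profile is assumed to be a flat map from terms to interest weights, the function names `personal_score` and `rerank_locally` are hypothetical, and the mixing weight `alpha` is an arbitrary choice.

```python
import math
from typing import Dict, List, Tuple


def personal_score(snippet: str, profile: Dict[str, float]) -> float:
    """Sum the profile weights of the snippet's terms, normalised by length."""
    terms = snippet.lower().split()
    if not terms:
        return 0.0
    return sum(profile.get(t, 0.0) for t in terms) / math.sqrt(len(terms))


def rerank_locally(
    results: List[Tuple[str, str]],
    profile: Dict[str, float],
    alpha: float = 0.3,  # assumed weight given to the server's original ranking
) -> List[str]:
    """Re-rank (url, snippet) pairs on the client; the profile never leaves it."""
    n = len(results)
    scored = []
    for rank, (url, snippet) in enumerate(results):
        server_score = 1.0 - rank / n  # earlier server results score higher
        combined = alpha * server_score + (1 - alpha) * personal_score(snippet, profile)
        scored.append((combined, url))
    # sorted() is stable, so ties keep the server's original order
    return [url for _, url in sorted(scored, key=lambda pair: -pair[0])]


profile = {"python": 1.0, "programming": 0.8}
results = [
    ("zoo.example/snake", "python snake habitat facts"),
    ("docs.example/lang", "python programming language tutorial"),
]
print(rerank_locally(results, profile))
```

With this hypothetical profile the programming tutorial moves ahead of the zoo page. The sketch also makes the bandwidth cost visible: the client can only promote results that the server has already transmitted.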
Personalized web search is a promising technique to improve retrieval effectiveness. However, it often relies on
personal user profiles, which may reveal sensitive personal information. The solutions to PWS can generally be
categorized into two types: click-log-based methods and profile-based ones. Click-log-based methods are
straightforward: they simply bias the ranking toward pages clicked in the user’s query history. Although this strategy has been
demonstrated to perform consistently and considerably well, it works only on repeated queries from the same user,
which strongly limits its applicability. In contrast, profile-based methods improve the search experience
with more complex user-interest models generated by user-profiling techniques. Profile-based methods are potentially
effective for almost all sorts of queries, but are reported to be unstable under some circumstances.
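The click-log-based idea, and its restriction to repeated queries, can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions rather than an implementation from the literature: the class name `ClickLog`, its methods, and the choice to rank by raw click counts are all hypothetical.

```python
from collections import defaultdict
from typing import List


class ClickLog:
    """Per-user history of which URLs were clicked for which query."""

    def __init__(self) -> None:
        # normalised query -> url -> click count
        self._clicks = defaultdict(lambda: defaultdict(int))

    @staticmethod
    def _normalise(query: str) -> str:
        return query.strip().lower()

    def record(self, query: str, url: str) -> None:
        """Remember that the user clicked `url` after issuing `query`."""
        self._clicks[self._normalise(query)][url] += 1

    def rerank(self, query: str, urls: List[str]) -> List[str]:
        """Move previously clicked URLs to the top, most-clicked first."""
        counts = self._clicks.get(self._normalise(query), {})
        # sorted() is stable, so unclicked URLs keep the engine's order
        return sorted(urls, key=lambda u: -counts.get(u, 0))


log = ClickLog()
log.record("jaguar", "cars.example/jaguar")
log.record("jaguar", "cars.example/jaguar")

engine_order = ["zoo.example/jaguar", "cars.example/jaguar", "wiki.example/jaguar"]
print(log.rerank("jaguar", engine_order))  # repeated query: the clicked page is promoted
print(log.rerank("puma", engine_order))    # unseen query: the order is unchanged
```

The second call shows the limitation noted above: for a query the user has never issued, the log holds no evidence and the ranking is returned as received.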