International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 03 Issue: 09 | Sep-2016 www.irjet.net p-ISSN: 2395-0072
© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 229
A comprehensive and heuristic approach for Personalized Web Search
using Greedy Algorithm
K.Saranya
M. Phil Scholar, School of Computer Science Engineering and Applications, Bharathidasan University,
Tiruchirappalli, Tamil Nadu, India, ksaranyakarunanithi.sk@gmail.com
---------------------------------------------------------------------***--------------------------------------------------------------
Abstract - Personalized web search (PWS) is used to
improve the quality of various search services on the
Internet. Users may be dissatisfied when search
engines return unrelated results that do not meet their
real intentions. Such irrelevance is largely due to the huge
variety of users’ contexts and backgrounds, as well as the
ambiguity of texts. However, evidence shows that users’
private information gathered during search can become
publicly known because of the proliferation of PWS. We propose a PWS
framework called UPS that can adaptively generalize
profiles by queries while respecting user-specified privacy
requirements. This work presents two greedy
algorithms, namely Greedy DP and Greedy IL, for runtime
generalization. It also provides an online prediction
mechanism for determining whether personalizing a
query is beneficial. Rough set theory, which has been used
successfully in solving problems in pattern recognition,
machine learning, and data mining, centers on the
idea that a set of individual objects may be approximated
by a lower and an upper bound. To obtain the benefits
that rough sets can provide for data mining and related
tasks, efficient computation of these approximations is
vital. Compared with classical set theory, rough set theory is a
mathematical approach to describing imprecision,
vagueness, and ambiguity in data analysis, and it was
first introduced by Pawlak.
Key Words: Classifier Ensemble Selection, Rough
Sets, Feature Selection, Harmony Search, Fuzzy-
rough Sets.
1. INTRODUCTION
Recommender systems can use data mining techniques
in order to make recommendations using knowledge
learnt from the actions and attributes of users. The main
aim of data mining is to discover new, interesting and
useful knowledge or information using a variety of
techniques such as prediction, classification, clustering,
association rule mining and sequential pattern
discovery. Currently, there is growing interest in data
mining and educational systems, making educational
data mining a new and growing research community.
The data mining approach to personalization uses all the
information about users/students which is available on
the web site (in the web course) in order to learn user
models and to make use of these models for
personalization. These systems can use different
recommendation techniques in order to recommend
online learning actions or optimal browsing pathways to
students, based on their preferences, knowledge and the
browsing history of other students with identical
characteristics.
Large amounts of data are generated every day,
and analyzing them is normally a challenge.
Experts need efficient data mining methods to extract
useful information and to perform the analysis of the
data. Rough Set Theory (RST) is one such method;
Pawlak introduced mathematical rough set theory in the
early 1980s. The theory is based on the
distinguishability of objects. Rough set theory gives
system designers the ability to handle ambiguity. If
a concept is 'not definable' in a given information base,
rough sets can 'approximate' it with respect to that
knowledge. From a medical point of view, the attribute-
value boundaries are generally vague.
The rough set philosophy is founded on the
assumption that with every object of the universe of
discourse we associate certain information (data,
knowledge). For example, if objects are patients
suffering from some disease, symptoms of the disease
form information about patients. Objects characterized
by the same information are indiscernible (similar) in
view of the available information about them. The
indiscernibility relation generated in this way is the
mathematical basis of rough set theory. This
understanding of indiscernibility is related to the idea
of Gottfried Wilhelm Leibniz that objects are
indiscernible if and only if all available functionals take
identical values on them (Leibniz's Law of Indiscernibility:
the Identity of Indiscernibles). However, in the rough set
approach indiscernibility is defined relative to a given set
of functionals (attributes).
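The indiscernibility relation and the lower and upper approximations it induces can be illustrated with a minimal Python sketch. This is not an implementation taken from this paper: the function names, the patient table, and the target set `has_flu` are hypothetical and chosen only to mirror the patient example above. Objects with identical values on the chosen attributes fall into one class; the lower approximation collects classes lying entirely inside the target set, and the upper approximation collects classes that overlap it.

```python
from collections import defaultdict


def indiscernibility_classes(table, attributes):
    """Group objects that take identical values on the given attributes."""
    classes = defaultdict(set)
    for obj, values in table.items():
        key = tuple(values[a] for a in attributes)
        classes[key].add(obj)
    return list(classes.values())


def approximations(table, attributes, target):
    """Return (lower, upper) approximations of `target` as sets of objects."""
    lower, upper = set(), set()
    for cls in indiscernibility_classes(table, attributes):
        if cls <= target:   # class lies entirely inside the target set
            lower |= cls
        if cls & target:    # class shares at least one object with the target
            upper |= cls
    return lower, upper


# Hypothetical toy data: patients described by two symptoms (assumed values).
patients = {
    "p1": {"fever": "yes", "cough": "yes"},
    "p2": {"fever": "yes", "cough": "yes"},
    "p3": {"fever": "no", "cough": "yes"},
    "p4": {"fever": "no", "cough": "no"},
}
has_flu = {"p1", "p3"}  # assumed target concept, for illustration only

lower, upper = approximations(patients, ["fever", "cough"], has_flu)
print("lower:", sorted(lower))
print("upper:", sorted(upper))
```

In this toy table p1 and p2 are indiscernible, yet only p1 belongs to the target set, so neither can be placed in the lower approximation with certainty; both appear in the upper approximation. The difference between the two approximations is the boundary region in which the concept is only roughly definable.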
A weak aspect of RST is the lack of freely available
RST software, apart from limited implementations. On the
other hand, proprietary RST software does exist. RST is an
extension of set theory and has the implicit feature of
compressing the dataset. Such compression is due to the
definition of equivalence classes based on indiscernibility
relations and to the elimination of redundant or