© April 2022 | IJIRT | Volume 8 Issue 11 | ISSN: 2349-6002
IJIRT 154465 INTERNATIONAL JOURNAL OF INNOVATIVE RESEARCH IN TECHNOLOGY 465
Review Spamicity approach by Resemblance Measure
Rajshree Patil¹, Jyoti G. Biradar²
¹,²Department of Computer Science, Government College (Autonomous), Kalaburgi
Abstract—The ubiquity of Web 2.0 makes the web an
invaluable source of information. For instance, product
reviews composed collaboratively by many independent
internet reviewers can help consumers make purchase
decisions and enable enterprises to improve their
business. In this work, an attempt is made to compare
reviews from different websites and detect whether each
review is spam or non-spam, in order to provide a
mechanism for proper decision making or for marketing
intelligence.
Keywords - Reviews, Feature extraction, Opinion
mining, Spam.
I. INTRODUCTION
Understanding the content of a reviewer's review of a
particular product means identifying the “key
concept” being expressed [4]. Locating the topic, main
idea, and supporting details helps one to understand
the point(s) of the review. Identifying the relationship
between reviewers and reviews increases our
comprehension [11]. The web contains a wealth of
opinions about products, politicians, and more, which
are expressed in newsgroup posts, review sites, and
elsewhere. As a result, the problem of opinion mining
has seen increasing attention over the last decade.
Product reviews on web sites such as amazon.com,
cnet.com, epinions.com and elsewhere often
associate meta-data with each review indicating how
positive (or negative) it is on a 5-star scale, and also
rank products by how they fare in the reviews at the
site [6]. It is now common practice for e-commerce
web sites to enable their customers to write reviews of
products that they have purchased. The reviews are
then used by potential customers to find the opinions of
existing users before purchasing the products [13].
They are also used by manufacturers to identify
problems in their products and/or to find competitive
intelligence information about their competitors [2][3].
The number of customer reviews that a product
receives is growing at a very fast rate. An important
issue, the trustworthiness of online opinions,
has largely been neglected. There is no reported
study on assessing the trustworthiness of reviews,
which is crucial for all opinion-based applications,
although web spam and email spam have been
investigated extensively. Different websites provide
different formats for writing reviews. There are
three different types of review format available on the
web. Format (1) - Pros and Cons: the reviewer
describes Pros and Cons separately; Cnet.com uses this
format. Format (2) - Pros, Cons and detailed review:
the reviewer describes Pros and Cons separately
and also writes a detailed review; Epinions.com uses this
format. Format (3) - Free format: the reviewer
writes freely, i.e., with no separation of Pros and Cons;
Amazon.com uses this format. In this work, we aim to
summarize customer reviews of the same product from
various websites such as Cnet.com and Epinions.com.
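As a rough illustration of how reviews in the three formats above could be held in one common record before they are compared across websites, consider the following minimal sketch. It is not the implementation used in this work: the class name Review, its fields, and the methods review_format and full_text are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Review:
    """Common record for one review (hypothetical schema, for illustration only)."""
    source: str                  # website the review came from, e.g. "cnet.com"
    product: str                 # product the review refers to
    pros: Optional[str] = None   # present in Formats (1) and (2)
    cons: Optional[str] = None   # present in Formats (1) and (2)
    body: Optional[str] = None   # present in Formats (2) and (3)

    def review_format(self) -> int:
        """Return 1, 2 or 3 depending on which fields are filled in."""
        has_pros_cons = self.pros is not None or self.cons is not None
        if has_pros_cons and self.body is None:
            return 1             # Pros and Cons only
        if has_pros_cons:
            return 2             # Pros, Cons and detailed review
        return 3                 # free format

    def full_text(self) -> str:
        """Join whatever fields are available into a single string."""
        parts = [self.pros, self.cons, self.body]
        return " ".join(p.strip() for p in parts if p)


# Example reviews of the same (made-up) product from two sites.
r1 = Review("cnet.com", "camera X", pros="sharp lens", cons="short battery life")
r3 = Review("amazon.com", "camera X", body="Sharp lens but the battery is weak.")
```

Flattening every format into a single text field through full_text is one possible design choice; it lets reviews from sites with different formats be treated uniformly, at the cost of discarding the Pros/Cons separation.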
II. RELATED WORK
A web mining taxonomy is given in [15], but it is
restricted to web content mining and web usage mining,
and it includes a survey of web usage mining. It divides web
content mining into the agent-based approach and the
database approach. The most relevant work in review
mining is that of Hu and Liu (2004) [1]. At present,
opinion mining has become a vital research subject in
the field of product reviews [4]. Although mining
opinions (positive and negative) from reviews has become
a popular research topic in recent years [1, 5], there is
still no reported study on review spam. A taxonomy of
Web spam is given in [5]. Few researchers have studied
this problem [e.g., 1, 5, 6]. Review spam is very
different from Web spam: adding irrelevant words has little effect.
Instead, spammers write undeserved positive reviews
to promote some objects and/or malicious negative
reviews to damage the reputation of other
objects. These false-opinion spam reviews are very
hard to detect. Another related research area is email spam
[7, 8], which is also quite different from review spam.
Email spam usually refers to unsolicited commercial
advertisements. Although they exist, advertisements in
reviews are not as frequent as in emails. Recent studies