© April 2022 | IJIRT | Volume 8 Issue 11 | ISSN: 2349-6002
IJIRT 154465 INTERNATIONAL JOURNAL OF INNOVATIVE RESEARCH IN TECHNOLOGY 465

Review Spamicity approach by Resemblance Measure

Rajshree Patil 1, Jyoti G. Biradar 2
1,2 Department of Computer Science, Government College (Autonomous), Kalaburgi

Abstract - The ubiquity of Web 2.0 makes the web an invaluable source of information. For instance, product reviews composed collaboratively by many independent internet reviewers can help consumers make purchase decisions and enable enterprises to improve their business. In this work, an attempt is made to compare reviews from different websites and detect whether a review is spam or non-spam, in order to provide a mechanism for proper decision making or for marketing intelligence.

Keywords - Reviews, Feature extraction, Opinion mining, Spam.

I. INTRODUCTION

Understanding a review means identifying the “key concept” that the reviewer expresses about a particular product [4]. Locating the topic, main idea, and supporting details helps one understand the point(s) a review makes. Identifying the relationship between reviewers and their reviews further increases our comprehension [11].

The web contains a wealth of opinions about products, politicians, and more, which are expressed in newsgroup posts, review sites, and elsewhere. As a result, the problem of opinion mining has seen increasing attention over the last decade. Web sites such as Amazon.com, Cnet.com, and Epinions.com often associate metadata with each review, indicating how positive (or negative) it is on a 5-star scale, and also rank products by how they fare in the reviews at the site [6].

It is now common practice for e-commerce web sites to enable their customers to write reviews of products that they have purchased. The reviews are then used by potential customers to find the opinions of existing users before purchasing the products [13].
They are also used by manufacturers to identify problems in their products and/or to gather competitive intelligence about their competitors [2][3]. The number of customer reviews that a product receives is growing at a very fast rate. However, an important issue, the trustworthiness of online opinions, has largely been neglected. Although web spam and email spam have been investigated extensively, there is no reported study on assessing the trustworthiness of reviews, which is crucial for all opinion-based applications.

Different websites provide different formats for writing reviews. There are three types of review format available on the web:

Format (1) - Pros and Cons: The reviewer describes Pros and Cons separately. Cnet.com uses this format.
Format (2) - Pros, Cons and detailed review: The reviewer describes Pros and Cons separately and also writes a detailed review. Epinions.com uses this format.
Format (3) - Free format: The reviewer can write freely, i.e., with no separation of Pros and Cons. Amazon.com uses this format.

In this work, we aim to summarize customer reviews of the same product collected from various websites, such as Cnet.com and Epinions.com.

II. RELATED WORK

The work in [15] gives a web mining taxonomy, restricted to web content mining and web usage mining, together with a survey of web usage mining. It divides web content mining into the agent-based approach and the database approach. The most relevant work in review mining is that of Hu and Liu (2004) [1]. At present, opinion mining has become a vital research subject in the field of product reviews [4]. Although mining opinions (positive and negative) from reviews has become a popular research topic in recent years [1, 5], there is still no reported study on review spam.

A taxonomy of Web spam is given in [5]. Few researchers have studied this problem [e.g., 1, 5, 6]. Review spam is very different from Web spam: adding irrelevant words to a review has little effect.
Instead, spammers write undeserved positive reviews to promote some objects and/or malicious negative reviews to damage the reputation of other objects. These false opinion spam reviews are very hard to detect. Another related research area is email spam [7, 8], which is also quite different from review spam. Email spam usually refers to unsolicited commercial advertisements. Although they exist, advertisements in reviews are not as frequent as in emails. Recent studies