International Journal of Electrical and Computer Engineering (IJECE)
Vol. 6, No. 3, June 2016, pp. 995 ~ 1001
ISSN: 2088-8708, DOI: 10.11591/ijece.v6i3.9878
Journal homepage: http://iaesjournal.com/online/index.php/IJECE
Automatic Detection of Illegitimate Websites with Mutual Clustering
K. Kanaka Durga, V. Rama Krishna
Dept of CSE, K L University, Guntur, AP
Article Info ABSTRACT
Article history:
Received Jan 6, 2016
Revised Mar 10, 2016
Accepted Mar 25, 2016
Web pages indexed by search engines often contain identical or very similar
content. Checking websites and their contents for such similarity creates
overhead for the search engine, which can severely affect its performance
and quality. The web crawling research community has implemented several
techniques to detect similar or identical web content and web documents.
Providing relevant data to users on the first results page is a major
concern for search engines. To address these issues, we propose a
methodology called Automatic Detection of Illegitimate Websites with Mutual
Clustering (ADIWMC). In this paper we present a novel and effective
approach for detecting similarities between web pages in web clustering.
Identical and similar web pages and web content are detected by storing the
crawled web pages in a repository. First, the adwords are extracted from the
crawled pages, and the similarity between two pages is checked based on
their usage of adwords. A threshold value is set for this check: if the
similarity percentage is greater than the threshold, the similar content is
reduced, which improves both the repository and the quality of the search
engine. The sections on the existing analysis and the proposed analysis
describe how the method works.
Keywords:
Illegitimate
Mutual Clustering
Phishing
Web Crawled
Copyright © 2016 Institute of Advanced Engineering and Science.
All rights reserved.
Corresponding Author:
K. Kanaka Durga,
M. Tech Student, Dept of CSE,
K L University,
Vaddeswaram 522502, Guntur District, Andhra Pradesh, India.
1. INTRODUCTION
Cybercriminals have adopted two familiar strategies to cheat consumers online: large-scale and targeted
attacks. Many scams are designed for large-scale success [1]. Phishing scams pose as banks and online
service providers by the thousand, sending millions of spam messages to lure a small fraction of users to a
fake website under criminal control [2]. In fact, many criminals work somewhere in between, faithfully
reproducing the logic of a fraud without fully automating its reproduction from previous versions of the
attack. For example, criminals engaged in advance-fee fraud create websites for banks that do not exist,
complete with online banking in which the victim can inspect their 'deposits'. When a fake bank is shut
down, the criminals create a new one adapted from the old site. Criminals have
set up fake escrow services as part of a larger advance-fee fraud. On the surface, the escrow sites seem
different, but they often share similarities in their text or HTML structure. Yet another example is the
online Ponzi schemes known as high-yield investment programs (HYIPs). These programs offer investors
extravagant interest rates, which means that they inevitably collapse when new deposits dry up. The
perpetrators behind the scenes then create new programs that often share similarities with previous
versions [3]. The designers of these scams have a strong
incentive to keep their new copies distinct from the old ones. Potential victims may be scared away when
they realize that an earlier version of the site was reported as fraudulent. So criminals make a concerted
effort to distinguish new copies from old ones.
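
The threshold-based similarity check outlined in the abstract can be illustrated with a minimal sketch. It is not the ADIWMC implementation: the tokenizer standing in for adword extraction, the use of the Jaccard index as the similarity measure, the threshold of 0.5, and the example pages are all assumptions made for illustration only.

```python
import re


def extract_terms(page_text):
    """Return the set of lowercase word tokens in a page.

    Assumption: plain word tokens stand in for the 'adwords' of a page.
    """
    return set(re.findall(r"[a-z0-9]+", page_text.lower()))


def jaccard_similarity(terms_a, terms_b):
    """Jaccard index: number of shared terms divided by number of distinct terms."""
    if not terms_a and not terms_b:
        return 0.0
    return len(terms_a & terms_b) / len(terms_a | terms_b)


def deduplicate(pages, threshold=0.5):
    """Keep a page only if its similarity to every kept page is at most the threshold.

    `pages` is a list of (url, text) pairs; the threshold value is hypothetical.
    """
    kept = []  # (url, terms) pairs retained in the repository
    for url, text in pages:
        terms = extract_terms(text)
        if all(jaccard_similarity(terms, other) <= threshold for _, other in kept):
            kept.append((url, terms))
    return [url for url, _ in kept]


# Hypothetical crawled pages: two near-copies of a fake bank and one unrelated page.
pages = [
    ("bank-a.example", "Welcome to First Trust online banking, check your deposits"),
    ("bank-b.example", "Welcome to Second Trust online banking, check your deposits"),
    ("news.example", "Local weather report and sports results for today"),
]
print(deduplicate(pages))  # ['bank-a.example', 'news.example']
```

Here the two bank pages share 8 of 10 distinct terms (similarity 0.8), so the second is treated as a copy and dropped, while the unrelated page is kept. A set-based measure is used only because it is simple to follow; it ignores term frequency and HTML structure.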