International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 2 Issue: 12 4205 - 4209
IJRITCC | December 2014, Available @ http://www.ijritcc.org

A Machine Learning Approach for Detection of Phished Websites Using Neural Networks

Charmi J. Chandan, Hiral P. Chheda, Disha M. Gosar, Hetal R. Shah, Prof. Uday Bhave
charmichandan@yahoo.in chheda.hiral@yahoo.com disha6gosar@gmail.com shahhetalnov@gmail.com

Abstract: Phishing is a means of obtaining confidential information through fraudulent websites that appear to be legitimate. Because detecting phishing involves ambiguous criteria and a number of considerations, neural network techniques are used to build an effective tool for identifying phished websites. Many phishing detection techniques are available, but a central problem is that web browsers rely on a blacklist of known phishing websites, while some phishing websites have a lifespan as short as a few hours. Websites with such a short lifespan are known as zero-day phishing websites. A faster recognition system is therefore needed for the web browser to identify zero-day phishing websites. To develop such a system, a neural network technique is used, which reduces the error and increases performance. This paper describes a framework to better classify and predict phishing sites.

I. INTRODUCTION

Phishing is a type of online fraud in which a scam artist uses an e-mail or website to illicitly obtain confidential information. It is a semantic attack that targets the user rather than the computer. It is a relatively new internet crime.
The phishing problem is hard because it is very easy for an attacker to create an exact replica of a legitimate banking website that looks very convincing to users. The communication (usually e-mail) directs the user to visit a website where they are asked to update personal information, such as passwords and credit card, social security, and bank account numbers, that the legitimate organization already has.

Certain characteristics in webpage source code distinguish phishing websites from legitimate websites, so phishing attacks can be detected by inspecting the webpage and searching its source code for these characteristics.

In this paper, we propose a heuristic-based approach for phishing detection. This approach checks one or more characteristics of a website to detect phishing rather than looking it up in a blacklist. These characteristics can be the uniform resource locator (URL), the hypertext markup language (HTML) code, or the page content itself. Most of the heuristics target the HTML source code. We extract a set of phishing characteristics and check for each one in the webpage source code; whenever a phishing characteristic is found, we decrease the initial secure weight. Finally, we calculate the security rating based on the final weight: the lowest-rated website is considered secure, and the others are most likely to be phishing websites.

The goal of this project is to apply multilayer neural networks to phishing websites and evaluate the effectiveness of this approach. We design the feature set, process the phishing dataset, and implement the neural network systems. We then use cross-validation to evaluate the performance of the neural network with different numbers of hidden units and activation functions. We also compare the performance of the neural network with other major machine learning algorithms.
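
The weight-based heuristic described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work. The initial weight of 100, the three characteristics checked, and their penalties are all hypothetical values chosen for illustration. We also assume here that the security rating is the total weight deducted, so that the lowest rating corresponds to the most secure website.

```python
import re
from urllib.parse import urlparse

INITIAL_WEIGHT = 100  # assumed starting "secure" weight (hypothetical value)


# Hypothetical phishing characteristics; each takes the URL and the HTML source.
def url_uses_ip_address(url, html):
    """True if the host part of the URL is a dotted IPv4 address."""
    host = urlparse(url).hostname or ""
    return re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host) is not None


def url_contains_at_symbol(url, html):
    """True if the URL contains '@', which can hide the real host."""
    return "@" in url


def page_has_password_field(url, html):
    """True if the HTML source contains a password input field."""
    pattern = r"<input[^>]*type=[\"']?password"
    return re.search(pattern, html, re.IGNORECASE) is not None


# (check, penalty) pairs; the penalties are illustrative assumptions.
CHECKS = [
    (url_uses_ip_address, 30),
    (url_contains_at_symbol, 20),
    (page_has_password_field, 10),
]


def final_weight(url, html):
    """Start from the initial weight and subtract a penalty per characteristic found."""
    weight = INITIAL_WEIGHT
    for check, penalty in CHECKS:
        if check(url, html):
            weight -= penalty
    return weight


def security_rating(url, html):
    """Assumed definition: total weight deducted, so a lower rating is more secure."""
    return INITIAL_WEIGHT - final_weight(url, html)
```

For example, under these assumptions a page at an IP-address URL that contains a password field loses 40 of its 100 points, while a plain page on a named host keeps its full weight and receives a rating of 0.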
From the statistical analysis, we conclude that a neural network with an appropriate number of hidden units can achieve satisfactory accuracy even when training examples are scarce. Moreover, our feature selection is effective in capturing the characteristics of phishing websites, as most machine learning algorithms yield reasonable results with it.

II. PHISHING SCAMS

There are many ways in which someone can use phishing to socially engineer a victim. For example, an attacker can manipulate a website address to make it look as though you are going to a legitimate website, when in fact you are going to a website hosted by a criminal.

The process of phishing involves five steps: planning, setup, attack, collection, and identity theft and fraud. During the planning stage, the phishers decide which business to target and determine how to get e-mail addresses for the customers of that business. They often use the same mass-mailing and address collection techniques as spammers. In the setup stage, once they know which business to spoof and who their victims are, the phishers create methods for delivering the message and collecting the data. Most often, this involves e-mail addresses and a web page. The attack stage is the step people are most familiar with: the phisher sends a phony message that appears to be from a reputable source. The collection stage is the one in which phishers record the information entered by victims into web pages or pop-up windows. The final stage is identity theft and fraud, in which the phishers use the information they have gathered to make illegal purchases