International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2458
Phishing Detection using Decision Tree Model
Aman Ahamed¹, Dr. Ramananda Mallya K², Anushri A Shetty³, Delisha DSouza⁴, Ashokkumar Tirumala Gopi⁵
¹,³,⁴,⁵ Dept. of Information Science and Engineering, Mangalore Institute of Technology & Engineering, Moodbidri.
² Associate Professor, Dept. of Information Science and Engineering, Mangalore Institute of Technology & Engineering, Moodbidri.
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Security is a primary concern in today's world of
rapidly advancing technology. Many cases of heavy financial
loss have been caused by common social engineering attacks.
Such attacks are carried out by technical means or are aimed
at a targeted device. They may take the form of a virus or a
Trojan, or of an ordinary-looking website link, also called a
URL (Uniform Resource Locator). These URLs lead to software
or malicious programs that extract the user's valuable and
private information (sensitive data) when the user opens the
URL on their machine. This form of attack is known as
phishing. The user typically sees a web page that appears
simple and interactive, but what lies behind it is dangerous.
Phishing is a fraudulent attempt by an attacker to steal the
user's private information, such as usernames, passwords,
bank account details, and credit card details. To counter
these attacks, advances in artificial intelligence and machine
learning offer efficient and compact techniques for
identifying fake URLs. We develop a machine learning model
based on the decision tree algorithm that scans URLs, filters
out common words, learns specific features, and then
provides the appropriate result.
Key Words: Uniform Resource Locator, Decision Tree,
Security, Machine Learning
1. INTRODUCTION
In layman's terms, phishing means that an attacker gives the
user a web link, that is, a programmed URL (Uniform Resource
Locator). Here "programmed" means that the URL carries
scripts, a virus, a malicious program that runs indefinitely,
or a zombie process that, once invoked, runs on its own and
carries out the tasks or commands ordered by the attacker.
The URL appears to be a normal one, but the attacker uses it
to obtain the user's private and confidential information for
the attacker's own benefit. Many domains are targeted. These
attacks occur mainly in the online payment sector, web-based
email, and cloud storage [1]. 78% of the attacks target
domains such as web-based mailing systems and online
payments. The remaining 22% of the attacks target industrial
sectors.
Phishing attacks can cause huge financial losses,
particularly in the banking domain. As the internet
revolution continues and technology keeps advancing, the
Internet has become an attractive place for all potential
users. Phishing is normally carried out by impersonating a
trustworthy person or entity on the Internet, combining
social engineering with technological tricks.
Lastly, financial institutions such as banks are becoming
increasingly important on the Internet, making people's
lives easier. Protecting people against these frauds is
essential in this digital era. Phishing is a major threat
when it comes to securing a website.
There are mainly two types of phishing attacks. The first is
spear phishing, which targets specific private or public
companies and individuals. The second is clone phishing, an
attack in which a legitimate, original email containing an
attachment or a URL/link is copied into a new email in which
the attachment or URL is replaced with a malicious one [2].
2. BACKGROUND
The main goal of a successful phishing attack is to steal the
user's data, assets, or private information through a fake
website [3]. Detecting bad URLs at an early stage is the best
strategy for avoiding contact with phishing websites.
Phishing websites can be identified through their base
domains [4].
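As an illustration of how domain-level properties of a URL can be turned into inputs for a classifier, the following minimal Python sketch extracts a few lexical features. The function names and the chosen features (URL length, whether the host is an IP address, dots and hyphens in the host, presence of "@", use of HTTPS) are assumptions made for this example; they are commonly used in phishing detection, but the text above does not specify which features the proposed model relies on.

```python
import ipaddress
from urllib.parse import urlparse


def is_ip_address(host: str) -> bool:
    """Return True if the host part is a literal IPv4/IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_url_features(url: str) -> dict:
    """Compute a few illustrative (hypothetical) lexical features of a URL."""
    # urlparse only fills in the host when a scheme is present,
    # so a default scheme is assumed for bare inputs like "example.com/x".
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_is_ip": int(is_ip_address(host)),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
    }
```

Calling `extract_url_features("http://192.168.0.1/login")`, for instance, marks `host_is_ip` as 1, a property that legitimate sites rarely exhibit. Each dictionary can then be flattened into a numeric row for a classifier.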
These domains are tied to the URL, which must be registered.
We implement machine learning algorithms to classify the
data in this case. The basic algorithms used here are as
follows. The proposed technique gives 95% accuracy. This
figure depends mainly on how much of the data set is
allocated to training and how much to testing.
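To make the training/testing split and the accuracy measurement concrete, the sketch below builds a very small decision tree (Gini impurity, threshold splits) in plain Python and evaluates it on a held-out portion of the data. It is not the implementation used in this work: the toy rows, the feature layout `[url_length, host_is_ip, has_at_symbol]`, the labelling rule, the 75/25 split, and all function names are assumptions chosen for illustration, and the sketch makes no attempt to reproduce the reported 95% accuracy.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini, or None."""
    best, best_score, n = None, gini(labels), len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow the tree recursively; leaves are plain integer class labels."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    lo = [i for i, r in enumerate(rows) if r[f] <= t]
    hi = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in lo], [labels[i] for i in lo],
                       depth + 1, max_depth),
            build_tree([rows[i] for i in hi], [labels[i] for i in hi],
                       depth + 1, max_depth))


def predict(tree, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def train_test_split(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle indices reproducibly and cut them into train and test parts."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_fraction))
    tr, te = idx[:cut], idx[cut:]
    return ([rows[i] for i in tr], [labels[i] for i in tr],
            [rows[i] for i in te], [labels[i] for i in te])


def accuracy(tree, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(labels)


# Invented toy data: [url_length, host_is_ip, has_at_symbol]; label 1 = phishing.
rows = [[n, ip, at] for n in (20, 45, 80, 120) for ip in (0, 1) for at in (0, 1)]
labels = [int(r[1] or r[2]) for r in rows]

X_train, y_train, X_test, y_test = train_test_split(rows, labels)
tree = build_tree(X_train, y_train)
test_accuracy = accuracy(tree, X_test, y_test)
```

Changing `test_fraction` or `seed` changes which rows the tree sees during training, which is why a reported accuracy should always be stated together with the split that produced it.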