Indian Journal of Artificial Intelligence and Neural Networking (IJAINN)
ISSN: 2582-7626 (Online), Volume-1 Issue-2, April 2021
Published By:
Lattice Science Publication
© Copyright: All rights reserved.
Retrieval Number: 100.1/ijainn.B1019021221
DOI:10.35940/ijainn.B1019.041221
Journal Website: www.ijainn.latticescipub.com
Abstract: Phishing causes many problems in the business industry.
Electronic commerce and electronic banking, such as mobile
banking, involve a large number of online transactions. To secure
such transactions, we have to identify the features that
discriminate legitimate websites from phishing websites. In this
study, we collected data from the PhishTank public data repository
and propose a K-Nearest Neighbors (KNN) based model for phishing
attack detection. The proposed model detects phishing attacks
through URL classification. The performance of the proposed model
is tested empirically and the results are analyzed. Experimental
results on the test set reveal that the model is efficient at
phishing attack detection. Furthermore, the value of K that gives
the best accuracy is determined in order to improve performance
on phishing attack detection. Overall, the average
accuracy of the proposed model is 85.08%.
Keywords: Phishing attack, Machine learning, KNN, Network
security, Phishing detection.
I. INTRODUCTION
Phishing is a type of social engineering attack in which the
attacker attempts to gain access to a system by collecting
sensitive information such as user names, passwords and credit
card details [1]. The attacker uses forged websites to collect
this sensitive information from online users such as mobile
banking customers. Once the sensitive information has been gathered
through forged websites, the attacker gains access to the
system, and such access causes financial loss to the legitimate
users.
Numerous methods and countermeasures have been
proposed to safeguard users from phishing attacks [2]. One
approach to phishing attack detection is content-based
anti-phishing, which uses visual similarity to distinguish
phishing websites from legitimate websites by analyzing the
similarity of their contents. Another approach is to classify
URLs into malicious and legitimate classes with a
machine-learning algorithm, producing a model
that automatically detects malicious URLs and takes action
to stop access to them.
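To make the URL classification approach concrete, the sketch below turns a URL string into a few lexical features that a classifier could consume. The feature names and the choice of features are illustrative assumptions; this section does not list the features actually used in the study.

```python
import re
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Return a few hypothetical lexical features of a URL.

    The feature set is an assumption made for illustration only;
    it is not the feature set of the study.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        # Total number of characters in the URL
        "url_length": len(url),
        # Number of dots in the host name (many subdomains can be suspicious)
        "num_dots_in_host": host.count("."),
        # Number of hyphens in the host name
        "num_hyphens_in_host": host.count("-"),
        # Whether an "@" character appears anywhere in the URL
        "has_at_symbol": "@" in url,
        # Whether the scheme is HTTPS
        "uses_https": parsed.scheme == "https",
        # Whether the host looks like a dotted-quad IPv4 address
        "host_is_ip": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
    }
```

Calling `url_features("http://192.168.0.1/login@secure")` yields a dictionary whose boolean entries can be mapped to 0/1 before being passed to a distance-based classifier such as KNN.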
Manuscript received on March 31, 2021.
Revised Manuscript received on April 05, 2021.
Manuscript published on April 10, 2021.
* Correspondence Author
Tsehay Admassu Assegie*, Department of Computer Science, Faculty
of Computing Technology, Aksum Institute of Technology, Aksum
University, Axum, Ethiopia. Email: tsehayadmassu2006@gmail.com
© The Authors. Published by Lattice Science Publication (LSP). This is an
open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/)
Although different approaches have been proposed for
detecting phishing attacks, phishing has remained a major
challenge to business industries that involve online
transactions. One of the major challenges of phishing attack
detection is that attackers adapt to the different
detection approaches. Consequently, an effective and
efficient phishing detection approach is important for tackling
the problem. Hence, we are motivated to
design and implement an alternative approach that detects
phishing attacks more efficiently with machine learning.
Overall, the contribution of this study is an
effective model for phishing attack detection: a
machine-learning model that automatically classifies a URL as
phishing or legitimate. Accordingly, the objective of
this study is to answer the following questions:
How can a classification model be created with the KNN
algorithm to detect phishing attacks?
What is the accuracy of the KNN algorithm on phishing attack
detection?
What value of K gives optimal accuracy on
phishing attack detection?
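The three questions above map onto three small functions: one that predicts a class by majority vote among the K nearest neighbors, one that measures accuracy on held-out data, and one that compares candidate values of K. The following is a minimal sketch using only the Python standard library. The Euclidean distance, the two-feature toy data, and the function names are assumptions for illustration; they are not the dataset or implementation of the study.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to x (distance metric is an assumption)
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)


def best_k(train_X, train_y, val_X, val_y, candidates):
    """Return the candidate k with the highest validation accuracy, plus all scores."""
    scores = {k: accuracy(train_X, train_y, val_X, val_y, k) for k in candidates}
    return max(scores, key=scores.get), scores


# Hypothetical toy data: two scaled features per URL, label 1 = phishing, 0 = legitimate
train_X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
train_y = [0, 0, 0, 1, 1, 1]
val_X = [(0.2, 0.2), (0.8, 0.8), (0.3, 0.1), (0.7, 0.9)]
val_y = [0, 1, 0, 1]

k_star, scores = best_k(train_X, train_y, val_X, val_y, candidates=[1, 3, 5])
```

Odd values of K are used here so that a two-class vote cannot tie. On real data the features would normally be scaled to comparable ranges first, since KNN is sensitive to feature magnitude.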
II. LITERATURE REVIEW
Numerous studies have been conducted on the phishing attack
detection problem and a number of approaches have been
proposed. Nevertheless, phishing attack detection remains a
major challenge, and much work is still required to overcome
it. This section presents a review
of recently published studies on phishing attack
detection that use an automated predictive model for decision
support in identifying whether a given URL is suspicious or
legitimate.
In [3], the authors proposed a decision tree based
model for phishing attack detection. The authors employed
11,055 observations of suspicious (phishing) and
non-suspicious (legitimate) websites with 30 features. The
performance of the proposed model was evaluated on a test set,
and the result shows that the decision tree algorithm is effective at
phishing attack detection. Although the accuracy is promising,
there is still considerable scope for improving the performance.
In another study on the phishing attack detection problem [4],
the random forest algorithm was
applied to the PhishTank dataset, and a model for phishing attack
detection was proposed to classify URLs into malicious (phishing)
and legitimate classes. The performance of the
proposed model was evaluated, and the result shows that the
model has acceptable accuracy on phishing attack detection.
In [5], a convolutional neural network based intelligent
phishing attack detection model is proposed.
K-Nearest Neighbor Based URL Identification
Model for Phishing Attack Detection
Tsehay Admassu Assegie