PDSMV3-DCRNN: A novel ensemble deep learning framework for
enhancing phishing detection and URL extraction
Y. Bhanu Prasad a,*, Venkatesulu Dondeti b
a Department of Computer Science & Engineering, Vignan’s Foundation for Science, Technology and Research (VFSTR) (Deemed to be University), Vadlamudi, Guntur, Andhra Pradesh, 522213, India
b Department of Advanced Computer Science & Engineering, Vignan’s Foundation for Science, Technology and Research (VFSTR) (Deemed to be University), Vadlamudi, Guntur, Andhra Pradesh, 522213, India
ARTICLE INFO
Keywords:
Phishing
Uniform resource locator (URL)
Binary grey goose optimization algorithm (BGGOA)
Pyramid depth-wise separable-MobileNetV3 (PyDS-MV3)
Deformable convolutional residual neural network (DCRNN)
Boosted ConvNeXt
ABSTRACT
Phishing is a cyber-attack that exploits victims’ technical ignorance or naivety and commonly involves a Uniform
Resource Locator (URL). It is therefore beneficial to examine URLs before accessing them in order to spot a
phishing attack. Several machine learning-based algorithms have been proposed to detect phishing attempts.
However, these approaches often suffer from lower accuracy, longer response times, and higher false positive
rates. Furthermore, many existing methods rely heavily on predefined feature sets, which may limit their
adaptability and robustness. In contrast, our proposed method uses a more dynamic feature selection process,
which combines the Conditional Wasserstein Generative Adversarial Network (CWGAN) for addressing data
imbalance with the Binary Grey Goose Optimization Algorithm (BGGOA) for optimal feature selection. This
dynamic approach enhances the model’s ability to adapt to varying data characteristics, improving detection
performance. The proposed solution is divided into two stages: pre-deployment and deployment. During the
pre-deployment stage, the dataset is preprocessed, which includes data transformation, handling of irrelevant
and redundant data, and data balancing. Minority-class samples are augmented using CWGAN to avoid class
imbalance. Features are then selected using BGGOA, resulting in a feature-reduced dataset used for training and
testing the ensemble deep learning classifiers, specifically the novel Pyramid Depth-wise Separable-MobileNetV3
(PyDS-MV3) and the Deformable Convolutional Residual Neural Network (DCRNN), together termed
PDSMV3-DCRNN. During the deployment stage, the Boosted ConvNeXt approach extracts URL features, which are
fed into the trained classifier to predict "phishing" or "benign". According to the experimental findings, the
proposed solution outperforms all other tested approaches, with a faster training time of 0.11 s and an
accuracy of 99.21%.
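As a rough illustration of what examining a URL before accessing it can involve, the following Python sketch computes a handful of lexical features from a URL string. It is not the Boosted ConvNeXt extractor named in the abstract, whose details are not given here; the function name `lexical_url_features` and the chosen features are assumptions made for this example only.

```python
import ipaddress
from urllib.parse import urlsplit


def lexical_url_features(url: str) -> dict:
    """Return a few hand-picked lexical features of a URL.

    Hypothetical helper for illustration; the feature set is an
    assumption and is not taken from the paper.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""

    # A raw IP address in place of a domain name is a common phishing cue.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        "host_is_ip": host_is_ip,
        "uses_https": parts.scheme == "https",
        "num_digits": sum(ch.isdigit() for ch in url),
    }
```

Calling `lexical_url_features("https://www.example.com/login")` returns a plain dictionary that could be turned into a numeric vector for a classifier. Hand-crafted features like these are only a stand-in; the proposed method obtains its features from a learned extractor instead.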
1. Introduction
Because of their rapid growth, phishing attacks are a significant
source of concern. Attackers often use this damaging and effective
tactic to fool victims into disclosing personal information such as credit
card numbers and passwords (Jha et al., 2023; Rao et al., 2020). Website
phishing is a common attack strategy in which the attacker impersonates
reputable websites such as Amazon, eBay, and Facebook to trick users
into visiting fraudulent websites. Because phishing websites mimic
legitimate ones, it is difficult for the average person to distinguish
between them (Jafari and Aghaee-Maybodi, 2024). Because most
visitors to a website do not read the entire URL, attackers can
quickly obtain sensitive and personal information. Many anti-phishing
techniques have emerged recently to identify potential phishing
threats early on and shield users from these attack vectors (Bozkir
et al., 2023; Bountakas and Xenakis, 2023; Tang and Mahmoud, 2021).
The development of phishing attacks is hindered by the increasing use of
DL device-based security methods in numerous industries (Das et al., 2021).
While machine learning (ML)-based methods yield higher detection
accuracy, they have several significant shortcomings (Shirazi et al.,
2023; Zonyfar et al., 2023). (a) They cannot extract semantic
patterns: because the URL is assessed from only one specific angle, not all
of the characteristics of phishing websites are captured. (b) Manual
feature engineering is used in feature extraction, necessitating
* Corresponding author.
E-mail addresses: 241fg04002@vignan.ac.in (Y.B. Prasad), dean_addl._acse@vignan.ac.in (V. Dondeti).
Computers & Security
https://doi.org/10.1016/j.cose.2024.104123
Received 6 March 2024; Received in revised form 31 July 2024; Accepted 15 September 2024
Computers & Security 148 (2025) 104123
Available online 17 September 2024
0167-4048/© 2024 Elsevier Ltd. All rights are reserved, including those for text and data mining, AI training, and similar technologies.