PDSMV3-DCRNN: A novel ensemble deep learning framework for enhancing phishing detection and URL extraction

Y. Bhanu Prasad a,*, Venkatesulu Dondeti b

a Department of Computer Science & Engineering, Vignan’s Foundation for Science, Technology and Research (VFSTR) (Deemed to be University), Vadlamudi, Guntur, Andhra Pradesh, 522213, India
b Department of Advanced Computer Science & Engineering, Vignan’s Foundation for Science, Technology and Research (VFSTR) (Deemed to be University), Vadlamudi, Guntur, Andhra Pradesh, 522213, India

ARTICLE INFO

Keywords:
Phishing
Uniform resource locator (URL)
Binary grey goose optimization algorithm (BGGOA)
Pyramid depth wise separable-MobileNetV3 (PyDS-MV3)
Deformable convolutional residual neural network (DCRNN)
Boosted ConvNeXt

ABSTRACT

Phishing is a cyber-attack that exploits victims’ technical ignorance or naivety and commonly involves a Uniform Resource Locator (URL). It is therefore beneficial to examine URLs before accessing them in order to spot a phishing attack. Several machine learning-based algorithms have been proposed to detect phishing attempts. However, these approaches often perform poorly, with lower accuracy, longer response times, and higher false positive rates. Furthermore, many existing methods rely heavily on predefined feature sets, which may limit their adaptability and robustness. In contrast, our proposed method uses a more dynamic feature selection process, which includes the Conditional Wasserstein Generative Adversarial Network (CWGAN) for addressing data imbalance and the Binary Grey Goose Optimization Algorithm (BGGOA) for optimal feature selection. This dynamic approach enhances the model’s ability to adapt to varying data characteristics, improving detection performance. The proposed solution is divided into two stages: pre-deployment and deployment.
During the pre-deployment stage, the dataset is preprocessed, which includes transforming the data, handling irrelevant and redundant data, and balancing the classes. Minority-class samples are augmented using CWGAN to avoid class imbalance. Features are then selected using BGGOA, resulting in a feature-reduced dataset used for training and testing ensemble deep learning classifiers, specifically the Novel Pyramid Depth-wise Separable-MobileNetV3 (PyDS-MV3) and Deformable Convolutional Residual Neural Network (DCRNN), termed PDSMV3-DCRNN. During the deployment phase, the Boosted ConvNeXt approach extracts URL features, which are fed into the trained classifier to predict "phishing" or "benign". According to experimental findings, the proposed solution outperforms all other tested approaches, with a faster training time of 0.11 s and the best accuracy of 99.21%.

1. Introduction

Owing to their rapid growth, phishing attacks are a significant source of concern. Attackers often use this damaging and successful tactic to fool victims into disclosing personal information such as credit card numbers and passwords (Jha et al., 2023; Rao et al., 2020). Website phishing is a common attack strategy in which the attacker impersonates reputable websites such as Amazon, eBay, and Facebook to trick users into visiting fraudulent websites. Because phishing websites mimic legitimate ones, it is difficult for the average person to distinguish between them (Jafari and Aghaee-Maybodi, 2024). Because most visitors to a website do not inspect the entire URL, attackers can quickly obtain sensitive and personal information. Many anti-phishing techniques have emerged recently to identify potential phishing threats early (Bozkir et al., 2023; Bountakas and Xenakis, 2023; Tang and Mahmoud, 2021) and shield users from these attack vectors.
The development of phishing attacks is hindered by the increasing use of deep learning (DL) device-based security methods in numerous industries (Das et al., 2021). While machine learning (ML)-based methods yield higher detection accuracy, they have several significant shortcomings (Shirazi et al., 2023; Zonyfar et al., 2023). (a) The inability to extract semantic patterns: because the URL is assessed from a specific angle, not all of the characteristics of phishing websites are extracted. (b) Manual feature engineering is used in feature extraction, necessitating

* Corresponding author.
E-mail addresses: 241fg04002@vignan.ac.in (Y.B. Prasad), dean_addl._acse@vignan.ac.in (V. Dondeti).
Contents lists available at ScienceDirect
Computers & Security
journal homepage: www.elsevier.com/locate/cose
https://doi.org/10.1016/j.cose.2024.104123
Received 6 March 2024; Received in revised form 31 July 2024; Accepted 15 September 2024
Available online 17 September 2024
Computers & Security 148 (2025) 104123
0167-4048/© 2024 Elsevier Ltd. All rights are reserved, including those for text and data mining, AI training, and similar technologies.