SN Computer Science (2022) 3:488
https://doi.org/10.1007/s42979-022-01387-4

ORIGINAL RESEARCH

Prediction of Phishing Websites Using Stacked Ensemble Method and Hybrid Features Selection Method

Mithilesh Kumar Pandey 1 · Munindra Kumar Singh 1 · Saurabh Pal 1 · B. B. Tiwari 2

Received: 31 January 2022 / Accepted: 25 August 2022
© The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd 2022

Abstract
Phishing is a major concern in this age of data and digital technologies because of its significant influence on the banking and online retailing industries. Cybercriminals target all economic activity on the Internet; thus, it is critical to take security precautions to safeguard assets. One of the first steps in constructing a safe cyberspace is to prevent phishing attacks before they happen. Detection mechanisms for these attacks have been built using machine learning and other methods. However, there is still room for improvement in detection accuracy. This paper proposes the optimization of an ensemble classification algorithm for phishing website (PW) detection. The proposed technique was optimised using a hybrid features selection method (Chi-square, extra tree, and heatmap) and by tuning the parameters of several machine learning (ML) methods, including random forest, naive Bayes, J48, and KNN. This was achieved by ranking the optimised classifiers and selecting the top classifiers to serve as the foundation of the proposed technique. The results of all experiments show that the optimized stacking ensemble approach outperforms previous ML-based detection methods. The level of precision attained was 99.7%.

Keywords Phishing websites · Random forest · Naïve Bayes · KNN · J48 · Stacked ensemble method and features selection methods: Chi-square, extra tree, and heatmap

Introduction
The Internet is an indispensable element of our daily lives and covers a broad range of everyday activities.
Many individuals use it for a variety of purposes, including shopping, bill payment, banking, and communication. As usage increases, users face growing security issues, including identity theft, hacking, phishing, and other cybercrimes. Phishing is the most prevalent cybercrime attack. It is characterised as a social engineering technique used to trick customers into visiting phoney websites that steal sensitive information such as bank details. People often fall for the information in phishing emails and websites due to a lack of awareness, which the attacker exploits to penetrate the user's privacy and obtain critical information. This occurs when an attacker creates a phishing website so similar to a legitimate website that some users cannot tell the difference. Sending an email with links to bogus websites that mimic actual websites is one of the most prevalent strategies employed by fraudsters. When opened, these pages appear legitimate and ask for details such as bank account or checking account information [1].

This article is part of the topical collection “Advances in Computational Approaches for Artificial Intelligence, Image Processing, IoT and Cloud Applications” guest edited by Bhanu Prakash K N and M Shivakumar.

* Saurabh Pal
drsaurabhpal@yahoo.co.in

Mithilesh Kumar Pandey
mithileshkumarmca@gmail.com

Munindra Kumar Singh
munindra09_vbspu@yahoo.in

B. B. Tiwari
bbtiwari62@gmail.com

1 Department of Computer Applications, VBS Purvanchal University, Jaunpur, Uttar Pradesh 222001, India
2 Department of Electronics and Communication, VBS Purvanchal University, Jaunpur, Uttar Pradesh 222001, India