Logistic Regression Based Classification of Spam and Non-Spam Emails

Shahbaz Ahmad Khanday 1, Suraiya Parveen 2
{shahbazshaban10@gmail.com 1, husainsuraiya@gmail.com 2}
Jamia Hamdard University, New Delhi, India 1,2

Abstract. An email client receives emails from different websites, portals and domains, some of which may be advertisements. Receiving emails in bulk can cause serious damage, such as the suspension of a particular email ID. An email client is mostly exposed to malicious emails by registering an email account with a web portal, which in turn sends emails in bulk. One solution for escaping spam is to develop a decision-based system that classifies emails as spam or non-spam. This can be achieved by applying machine learning and deep learning algorithms to the emails received by an email client. Machine learning approaches such as SVM, the naive Bayesian classifier, artificial neural networks and random forests can be of considerable help in identifying spam emails. After a spam email source has been classified, a user can navigate to, block and report the source of the spam, such as a spam-bot.

Keywords: machine learning, decision tree, support vector machine (SVM), logistic regression, artificial neural networks, naive Bayesian classifier, spam-bots.

1 Introduction

An ordinary person can receive a huge number of emails in a day. An email user receives emails from different sources related to day-to-day activities such as social networking, file sharing, online shopping, e-billing, e-commerce and applications. One should be able to distinguish important and useful emails from spam or junk emails. Once a user is exposed to spam and malicious sources, he will receive a large number of emails from various unknown sources.
Therefore, it becomes a hectic and time-consuming task for an email user to sort through all the received emails, some of which may contain important data or information. The situation becomes very risky when an email client is trapped in a malicious act, since the security and privacy of the system could then be breached. The email user could be drawn into a phishing attack initiated by cyber criminals. It is very hard to recover from such situations, and most of the time an email user is attracted to spam emails and responds to them. In most cases, blocking and reporting these spam email sources is useless, as the senders change their location continuously. One alternative is to track the particular IP addresses from which an email user receives these spam emails, but the task becomes harder when the IP addresses are many rather than few. The major difficulty arises when the senders change their locations and targets.

ICIDSSD 2020, February 27-28, New Delhi, India. Copyright © 2021 EAI. DOI 10.4108/eai.27-2-2020.2303291

One of the