IJARSCT ISSN (Online) 2581-9429
International Journal of Advanced Research in Science, Communication and Technology (IJARSCT)
International Open-Access, Double-Blind, Peer-Reviewed, Refereed, Multidisciplinary Online Journal
Volume 4, Issue 6, May 2024
Copyright to IJARSCT, DOI: 10.48175/568, www.ijarsct.co.in, Impact Factor: 7.53

Fake News Detection using Machine Learning
Anusha Amrutkar 1, Jay Pawar 2, Usha Sahal 3, Dr. B. S. Shirole 4
Department of Computer Engineering
Sanghavi College of Engineering, Nashik, India

Abstract: In the modern era of a global internet, people rely on a variety of online resources for news. With the growing use of social media platforms such as Facebook and Twitter, news spreads among millions of users within a very short span of time. The spread of fake news has far-reaching consequences, such as the creation of biased opinions. This project demonstrates an approach to detecting fake news. The dataset was provided by the company. We perform binary classification of news articles available online using concepts from Artificial Intelligence, Natural Language Processing and Machine Learning. A decision tree classifier provides the ability to classify the news as fake or real. Different feature engineering methods for text data are used, such as the Bag of Words model and the word embedding model, which convert the text data into feature vectors that are passed to machine learning algorithms to classify the news as fake or not. We evaluate different combinations of features and classification algorithms, and use the feature extraction method and algorithm that together give the best result to predict whether news is fake or real.

Keywords: News Identification dataset, Deep Learning, Machine Learning, Classification

I. INTRODUCTION
Data or information is the most valuable asset.
The most important problem to be solved is to evaluate whether the data is relevant or irrelevant. Fake data has a huge impact on many people and organizations. Since fake news tends to spread faster than real news, there is a need to classify news as fake or not. The dataset used in this project is from the Kaggle website, where real news and fake news are provided as two separate datasets; we combined both into one and trained different machine learning classification algorithms on it to classify the news as fake or not. Different feature engineering methods for text data are used, such as the Bag of Words model and the word embedding model, which convert the text data into feature vectors that are passed to machine learning algorithms. We evaluate different combinations of features and classification algorithms, and use the feature extraction method and algorithm that together give the best result to predict whether news is fake or real. In this project we ignore attributes such as the source of the news or whether it was reported online or in print, and instead focus only on the content being reported. We aim to use different machine learning algorithms and determine the best way to classify news.

II. PURPOSE
The purpose is to build a fake news classification system using different feature extraction methods and different classification algorithms, such as Support Vector Machine, Logistic Regression, Gradient Boosting, XGBoost, Decision Tree and Random Forest; the best-performing algorithm is then used to predict whether news is fake or real. In order to create a real-time application, the algorithm should be fed with the most recent data. The data comes in different sizes, so it should be properly cleaned to get better results.
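As a rough illustration of the steps described so far (combining the separately stored fake and real articles into one labelled dataset, cleaning the text, and converting it into Bag of Words feature vectors), a minimal sketch in plain Python might look like the following. The sample headlines, the function names, the cleaning rule and the label convention (1 = fake, 0 = real) are assumptions made for illustration only; they are not the project's actual data or code.

```python
import re
from collections import Counter


def clean(text):
    """Lowercase the text and keep only alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in clean(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocabulary):
    """Return a word-count vector for `text` over the given vocabulary.

    Tokens that are not in the vocabulary are ignored.
    """
    counts = Counter(clean(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical toy stand-ins for the two separate datasets.
fake_articles = [
    "SHOCKING!!! Aliens endorse candidate",
    "Miracle cure found, doctors shocking",
]
real_articles = [
    "Parliament passes the annual budget",
    "Doctors report progress on vaccine",
]

# Combine both sets into one labelled dataset (assumed: 1 = fake, 0 = real).
dataset = [(t, 1) for t in fake_articles] + [(t, 0) for t in real_articles]
texts = [t for t, _ in dataset]
labels = [y for _, y in dataset]

# Build the vocabulary and turn every article into a feature vector.
vocabulary = build_vocabulary(texts)
features = [bag_of_words(t, vocabulary) for t in texts]
```

Here `features` and `labels` are the kind of input that a classifier such as a decision tree would be trained on; a word embedding model would replace `bag_of_words` with a dense vector per article.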
We therefore use different algorithms and feature extraction methods, such as the Bag of Words model and the word embedding model, to get the best result.

III. OBJECTIVE OF SYSTEM
To achieve our goal of developing a machine learning model to classify news as fake or real, we need to perform the following tasks in the order stated.