International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-8 Issue-4, November 2019 12287 Retrieval Number: D4308118419/2019©BEIESP DOI:10.35940/ijrte.D4308.118419 Published By: Blue Eyes Intelligence Engineering & Sciences Publication

Rumor Detection System for Twitter (A Micro-Blogging Site)

Sakshi Yadav, Anuradha Purohit

Abstract: Micro-blogs provide a platform for users to share their thoughts and information expressively in a limited number of words. Their concise and easily accessible nature makes them popular among every age group. In spite of these advantages and this popularity, some people use micro-blogs with malicious motives, i.e., to mislead people and incite violence. To overcome this problem, a system is required that can detect fake tweets in a limited amount of time. In this paper, a feature-based approach for rumor detection is proposed. The proposed approach utilizes 9 features that capture the author's as well as the readers' reactions to identify rumor tweets; these reactions may differ for different users in different situations. For experimentation, a synthetic dataset and the Pheme dataset have been utilized. A comparative study of the approach on the two datasets has been carried out on the basis of the evaluation parameters recall, precision and F-measure. Satisfactory results have been obtained for the Pheme data with fewer features as compared to the synthetic dataset.

Keywords: Micro-blogs, Social Media, Rumor, Machine learning algorithms.

I. INTRODUCTION

Nowadays, micro-blog systems are increasingly popular. The reason for this popularity is the fast transfer rate of information. Using a micro-blog system, a user can share information and views by publishing a post, re-posting, or re-posting with their own comments [1]. Some popular micro-blog systems are Twitter, Tumblr and Sina Weibo. One of the most popular micro-blog systems in India is Twitter: from businessmen to politicians and from common people to popular personalities, all are active on Twitter.
As popularity increases, the number of fake tweets is also likely to increase. Credibility matters a great deal where information is concerned: information can incite violence, but it can also help to resolve many serious issues. The main motive behind these fake tweets is to create violence and misguide people. A rumor is a piece of information whose sources are untrustworthy. These fake tweets are known as rumors. They are likely to be generated under crisis and extremity, causing public panic, disrupting the social order, decreasing trust in the government and directly affecting the security of the nation.

Revised Manuscript Received on November 22, 2019.
Sakshi Yadav, M.E. Scholar, Computer Engineering Department, S.G.S.I.T.S. Indore, sakshiyad06@gmail.com
Anuradha Purohit, Associate Professor, Computer Engineering Department, S.G.S.I.T.S. Indore, anuradhapurohit78@gmail.com

For example, in June 2016, after banknote demonetization was officially announced, a message claiming that the RBI had declared the 10-rupee coin invalid spread quickly on social networks, mostly in metro cities such as Delhi. As a result, shopkeepers and rickshaw drivers stopped accepting 10-rupee coins, which created confusion among people. This rumor became a major issue. Subsequently, the RBI confirmed that those who refused to accept the currency would face legal action.

Credibility is a major concern for researchers. For this reason, researchers focus on the reliability of information that spreads through online platforms, using features extracted from tweets [2]. Some researchers build on previously conducted surveys by applying the k-nearest neighbor and Naive Bayes classifiers, which are machine learning algorithms, to improve the efficiency of existing approaches. Many researchers have shown interest in automatic rumor detection methods for online social platforms.
These methods can be classified into two categories: classification-based approaches and propagation-based approaches [3]. In the direction of automatic rumor detection, a classification method has been proposed [4] that treats rumor detection as a binary classification problem and makes use of a combination of implicit features and shallow features of the messages. As the popularity of micro-blogs increases, the amount of data also increases, so finding a recently trending topic becomes a tough task. Their importance may also vary with time as well as with the situation. Since feature-based identification is more reliable and dependable, the proposed approach is a feature-based rumor detection system.

In this paper, a feature-based approach is proposed in which the behavior of the user is treated as a hidden clue for finding rumor posts. The proposed approach works in three phases:
1) Based on collected micro-blogs from Twitter, features of the users' behavior are gathered.
2) Three of the most popular algorithms are used to train classifiers for rumor detection: SVM (support vector machine), RF (random forest) and MaxEnt (maximum entropy).
3) The trained classifiers from phase two are used to predict whether a post is a rumor or not.

Experiments are conducted on two Twitter datasets to show the performance of the designed system: a synthetic dataset, which is based on sentiments, and the Pheme dataset, which consists of rumors and normal posts related to 5 breaking news events. Evaluation parameters such as precision, recall and F-measure are calculated, and the results show that using fewer features can improve performance.
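
The evaluation parameters named above can be illustrated with a minimal Python sketch. It assumes binary labels in which 1 marks a rumor and 0 a normal post; the function name `precision_recall_f1` and the sample labels are hypothetical and are not taken from the paper's experiments.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    positive: int = 1,
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) for the `positive` class.

    Assumption: labels are integers, with `positive` marking a rumor.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # True positives: rumors correctly flagged as rumors.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: normal posts wrongly flagged as rumors.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: rumors that the classifier missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical ground truth and predictions for eight posts.
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    predicted = [1, 0, 1, 0, 1, 0, 1, 0]
    p, r, f = precision_recall_f1(truth, predicted)
    print(f"precision={p:.2f} recall={r:.2f} f-measure={f:.2f}")
```

Computing the three values per classifier and per dataset in this way would allow the SVM, RF and MaxEnt models to be compared on the same footing.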