International Journal of Computer Applications (0975 - 8887) Volume 174 - No.10, January 2021

COVFake: A Word Embedding Coupled with LSTM Approach for COVID Related Fake News Detection

Muhammad Usama Islam
School of Computing and Informatics
University of Louisiana at Lafayette

Md. Mobarak Hossain
Department of CSE
Dhaka University of Engineering and Technology

Mohammod Abul Kashem
Department of CSE
Dhaka University of Engineering and Technology

ABSTRACT
Coronavirus (COVID) took a substantial toll on human life with its unprecedented arrival in the human sphere. This unforeseen circumstance led monitoring bodies to issue various procedural guidelines, including face-mask guidelines, hand-washing guidelines and so forth. However, with the advent of this disease, misinformation became a causal factor in this scenario, which has claimed millions of lives in the process. A threatening disease coupled with misinformation has created a disastrous scenario in human life. Our approach exploits the power of natural language processing, specifically word embedding and long short-term memory (LSTM), to detect COVID related fake news. Our model achieves a promising accuracy of 96%, which constitutes our contribution to addressing this massive outbreak from a linguistic standpoint.

General Terms
Coronavirus, Fake news

Keywords
Coronavirus news, Fake news, Natural language processing, Text analysis, Long short term memory, Word embedding, Fake news detection

1. INTRODUCTION
Fake news as a term gained immense popularity with the advent of social media, where alternative media companies started twisting the truth to feed false narratives or promote conspiracy theories [23].
While one of its major blows was to elections [3], the major effect was that people became more polarized than ever in taking certain decisions [26]. Fake news has changed not only how a government responds to an issue but also how its population reacts to the resulting policies. To a larger extent, misinformation and fake news became intertwined, forming their own policy guidelines that started dictating people's lives [5]. Computational linguists, having understood the harsh rhetoric and effect of fake news, have successfully started working on natural language processing based models to detect fake news [19].

Coronavirus, a virus of the SARS family, has affected life across the entire planet since late December 2019, claiming millions of lives and ultimately becoming a pandemic [18]. With new guidelines being issued by regulating authorities such as the World Health Organization (WHO), the Centers for Disease Control and Prevention (CDC) and so forth, it was anticipated that a polarized community coupled with misinformation would be exposed to fake news in the COVID era [16, 4].

In our investigation, we bring these two factors together to demonstrate the effect of fake news on COVID, followed by a detection model inspired by LSTM [9, 24] and word embedding [10, 8] to classify and detect COVID related fake news.

The rest of the paper is organized as follows. Section 2 discusses related work on COVID related fake news, followed by the motivation for carrying out research in this field. Section 3 presents our proposed approach, including the dataset description, dataset preparation, and language processing model. Section 4 discusses the implementation and evaluation of our system, followed by an analysis of results. Lastly, Section 5 contains the conclusion and future scope of this work.

2. LITERATURE REVIEW
Various research works have been carried out in natural language processing and understanding with the aid of machine learning and deep learning algorithms [12, 30]. Ahmed and colleagues [2] investigated fake news and built a classifier using unigram features and a linear support vector machine (SVM) to classify fake news. Similar work on SVM as well as logistic regression is performed in [27]. Granik et al. [11] took a different approach and used a naive Bayes classifier to detect fake news. A common thread across the research work above is the use of machine learning algorithms for fake news detection, which later paved the way for word embedding [10, 8], LSTM [9, 24] and deep learning [22] to take over. LSTMs as well as word embeddings have been well utilized for text based analysis in natural language processing, sentiment analysis and opinion mining [10, 8]. Although LSTM was developed in the late 1990s, its use has boomed since the data explosion phenomenon of 2013, while word embedding has been widely utilized for text based representation and understanding. Liang Wu and Huan Liu [28] investigated various embeddings, including graph embedding, to understand and trace the footprint of fake news delivery through social networks. Similar research work on fake news detection has been carried out in [7]. While substantial work has