Research Article

A Hybrid Feature Extraction Method for Nepali COVID-19-Related Tweets Classification

T.B. Shahi,1,2 C. Sitaula,1,3 and N. Paudel1

1 Central Department of Computer Science and Information Technology, Tribhuvan University, 44600 Kathmandu, Nepal
2 School of Engineering and Technology, Central Queensland University, Rockhampton 4701, QLD, Australia
3 Department of Electrical and Computer Systems Engineering, Monash University, Clayton 3800, VIC, Australia

Correspondence should be addressed to N. Paudel; nawarajpaudel@cdcsit.edu.np

Received 7 December 2021; Accepted 10 February 2022; Published 9 March 2022

Academic Editor: Thippa Reddy G

Copyright © 2022 T.B. Shahi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

COVID-19 is one of the deadliest viruses, which has killed millions of people around the world to this date. The reason for people's deaths is linked not only to infection but also to people's mental states and sentiments triggered by the fear of the virus. People's sentiments, which are predominantly available in the form of posts/tweets on social media, can be interpreted using two kinds of information: syntactical and semantic. Herein, we propose to analyze people's sentiments using both kinds of information (syntactical and semantic) on a COVID-19-related Twitter dataset available in the Nepali language. For this, we first use two widely used text representation methods, TF-IDF and FastText, and then combine them to obtain hybrid features that capture the highly discriminating features. Second, we implement nine widely used machine learning classifiers (Logistic Regression, Support Vector Machine, Naive Bayes, K-Nearest Neighbor, Decision Trees, Random Forest, Extreme Tree classifier, AdaBoost, and Multilayer Perceptron), based on the three feature representation methods: TF-IDF, FastText, and Hybrid.
To evaluate our methods, we use a publicly available Nepali COVID-19 tweets dataset, NepCOV19Tweets, which consists of Nepali tweets categorized into three classes (Positive, Negative, and Neutral). The evaluation results on NepCOV19Tweets show that the hybrid feature extraction method not only outperforms the other two individual feature extraction methods across nine different machine learning algorithms but also provides excellent performance when compared with the state-of-the-art methods.

1. Introduction

Natural language processing (NLP) techniques have been developed to assess people's sentiments on various topics. The classification of documents as Negative, Positive, or Neutral is known as sentiment analysis. Sentiment analysis of documents typically involves sentiment classification, topic modeling, and opinion mining. In particular, textual documents are obtained from various sources, such as social media posts and news documents. These documents reflect people's feelings, so their sentiments can be identified using machine learning techniques.

Currently, the number of social media posts, particularly tweets, is growing rapidly because of COVID-19. Processing and analyzing these posts lets us understand people's mental stress. To this end, the design and development of an automated AI tool is essential to understand and deal with people's mental stress. Few AI models for Nepali COVID-19-related sentiment analysis have been reported in the literature; therefore, we discuss sentiment analysis works carried out in the Nepali language as well as in a few other languages, such as English.

Recent works [1–8] on COVID-19 tweets sentiment analysis in English and other languages [8] underscore the efficacy of data-driven machine learning approaches, employing several kinds of analysis such as topic modeling, classification, and clustering.
This motivates a thorough comparison of machine learning methods for sentiment analysis, together with a better representation of tweets for sentiment classification. For this, they used popular feature extraction methods such as TF-IDF (Term

Hindawi, Computational Intelligence and Neuroscience, Volume 2022, Article ID 5681574, 11 pages, https://doi.org/10.1155/2022/5681574