Research Article
A Generalized Method for Sentiment Analysis across
Different Sources
Abubakar M. Ashir
Department of Computer Engineering, Tishk International University, Erbil, Iraq
Correspondence should be addressed to Abubakar M. Ashir; abubakar.ashir@tiu.edu.iq
Received 2 October 2021; Revised 27 November 2021; Accepted 5 December 2021; Published 18 December 2021
Academic Editor: Francesco Rundo
Copyright © 2021 Abubakar M. Ashir. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Sentiment analysis is widely used in a variety of applications such as online opinion gathering for policy directives in government,
monitoring of customer and staff satisfaction in corporate bodies, and public tension monitoring in politics and security
structures. In recent times, the field has met with a new set of challenges, where new algorithms have to contend with highly
unstructured sources of sentiment expression emanating from online social media fora. In this study, a rule- and lexicon-based
procedure is proposed together with unsupervised machine learning to implement sentiment analysis with an improved
generalization ability across different sources. To deal with sources devoid of syntactic and grammatical structure, the approach
incorporates a rule-based technique for emoticon detection, word contraction expansion, noise removal, and lexicon-based text
preprocessing using lexical features such as part of speech (POS), stop words, and lemmatization for local context analysis. A text
is broken into a number of tokens, each representing a sentence, and then lexicon-dependent features are extracted from each
token. The features are merged together using a combining function for a given text before being used to train a machine learning
classifier. The proposed combining functions leverage averaging and information gain concepts. Experimental results with
different machine learning classifiers indicate that improved performance with a great deal of generalization capacity across both
structured and nonstructured sources can be realized. The finding shows that carefully designed lexical features reinforce the learning
process in unsupervised learning more than using word embeddings alone as the features. Experimental results obtained from the
movie review dataset (recall = 74.9%, precision = 70.9%, F1-score = 72.9%, and accuracy = 72.0%) and the Twitter samples dataset
(recall = 93.4%, precision = 89.5%, F1-score = 91.4%, and accuracy = 91.1%) show the efficacy of the proposed approach in
comparison with other state-of-the-art research studies.
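The pipeline summarized in the abstract (rule-based cleanup, sentence-level tokens, per-sentence lexicon features, and a combining function) can be illustrated with a minimal Python sketch. This is not the author's implementation: the lookup tables, the two-element feature vector, and the function names `preprocess`, `sentence_features`, and `combine_by_average` are all hypothetical, and only the averaging combiner is shown (POS tagging, lemmatization, and the information gain combiner are omitted).

```python
import re

# Hypothetical, tiny lookup tables for illustration only (not from the paper).
EMOTICONS = {":)": "happy", ":(": "sad"}
CONTRACTIONS = {"can't": "cannot", "don't": "do not", "it's": "it is"}
LEXICON = {"good": 1.0, "great": 1.0, "happy": 1.0,
           "bad": -1.0, "sad": -1.0, "boring": -1.0}


def preprocess(text):
    """Rule-based cleanup: noise removal, emoticon mapping, contraction expansion."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)   # drop URLs and user mentions
    for emoticon, word in EMOTICONS.items():
        text = text.replace(emoticon, f" {word} ")    # map emoticon to a sentiment word
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)              # expand contractions
    text = re.sub(r"[^a-z.!?\s]", " ", text)          # drop remaining symbols and digits
    return re.sub(r"\s+", " ", text).strip()


def sentence_features(sentence):
    """Assumed feature vector: [share of positive words, share of negative words]."""
    words = sentence.split()
    scores = [LEXICON[w] for w in words if w in LEXICON]
    n_words = max(len(words), 1)
    positive = sum(1 for s in scores if s > 0)
    negative = sum(1 for s in scores if s < 0)
    return [positive / n_words, negative / n_words]


def combine_by_average(text):
    """Split a text into sentence tokens and average their feature vectors."""
    cleaned = preprocess(text)
    sentences = [s.strip() for s in re.split(r"[.!?]+", cleaned) if s.strip()]
    if not sentences:
        return [0.0, 0.0]
    features = [sentence_features(s) for s in sentences]
    return [sum(column) / len(features) for column in zip(*features)]


if __name__ == "__main__":
    print(combine_by_average("Great movie. Bad ending!"))
```

The combined vector returned for each text would then serve as one training example for whichever classifier is chosen; averaging keeps the feature length fixed regardless of how many sentences a text contains.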
1. Introduction
Sentiment analysis is a part of natural language processing
(NLP) which has received tremendous attention in recent history.
This may not be unconnected to the availability of
social media platforms, big data storage, increased Internet
connectivity and accessibility, and the unending desire of big
businesses and governments to understand people's opinions
for policy conceptualization and monitoring. Behind
this boom is the recent breakthrough in machine and deep
learning algorithms, leading to an astronomical improvement
in the performance of NLP tasks. Sentiment analysis
crisscrosses the subfields of computational linguistics and
information retrieval. In a general context, the major task in
sentiment analysis is tagging a given text
according to the expressed opinion, which usually involves three
tasks: (i) determine the objectivity of a text (i.e., subjective or
objective), (ii) determine the polarity of a subjective text (i.e.,
positive or negative), and (iii) determine the strength of the
subjective text [1]. Two major approaches exist
in the literature for sentiment analysis: the lexicon-based and
the machine learning-based approach. Each of these approaches
has its benefits and drawbacks. The lexicon-based approach is
a rule-based method which computes sentiment
by considering the semantic orientation of the words or
phrases in the text [1]. This implies the use of a dictionary of
words which are tagged with lexical features such as
sentiment polarity orientation, part of speech (POS), and
glosses. In fact, the approach represents a piece of text as a
token or a bag of words where semantic orientation of each
Hindawi
Applied Computational Intelligence and Soft Computing
Volume 2021, Article ID 2529984, 8 pages
https://doi.org/10.1155/2021/2529984