Sentiment Analysis of Turkish Drug Reviews with Bidirectional
Encoder Representations from Transformers
MEHMET BOZUYLA, Department of Electrical-Electronics Engineering, Faculty of Engineering, Pamukkale
University, Turkey
Sentiment analysis of user-generated product or service reviews is significant for enhancing quality. Among healthcare-related
computational linguistics studies, the analysis of user criticisms of drugs is of particular importance. Sentiment
analysis of healthcare reviews reveals the relations between patients, doctors and healthcare services. More specifically,
sentiment analysis of drug reviews may be used to identify adverse drug reactions (ADRs), to assist diagnosis and treatment,
and to support personalized therapy recommendations. Most drug review sentiment studies are in English. Although Turkish
is a widely spoken language, research in the medical domain is limited and, in particular, there is no study
on drug review sentiment analysis. In this study, we generated a Turkish drug review dataset and evaluated it
in detail for sentiment analysis using (i) traditional machine learning algorithms with language pre-processing steps, stemming
and feature selection, (ii) deep learning algorithms with the word2vec embedding language model and (iii) various bidirectional
encoder representations from transformers (BERT) models. The experiments show that neural
transformers are promising for Turkish drug review sentiment identification. In particular, the Turkish-dedicated BERT (BERTurk)
achieved the best drug review sentiment prediction performance, with a weighted-F1 score of 95.1%.
CCS Concepts: • Computing methodologies → Machine learning algorithms; Natural language processing; • Social
and professional topics → Medical information policy.
Additional Key Words and Phrases: Turkish, Drug Review, Word Embedding, Bidirectional Transformer
1 INTRODUCTION
The continuous generation of user text about products and services calls for analysis of those reviews to enhance
product and service quality. Sentiment Analysis (SA) research proposes automated natural language
processing (NLP) methods to extract information from the huge volume of user-generated data in various domains such as
tourism, marketing and even politics [9]. Medical or healthcare review analysis, an SA domain that has recently gained
importance, has relatively few studies in the literature compared to traditional user review
analysis areas. In particular, an academic search on drug review sentiment returns fewer results.
Healthcare-based user reviews may be used to extract information about the diagnosis of a disease, the status of a patient's
health or the effectiveness of a medical treatment. For example, a patient may describe his/her experience of a
treatment received for a disease [44].
Analysis of drug reviews may also be used to obtain useful insights for the healthcare domain. In particular, SA
of drug reviews may be used to address adverse drug reactions (ADRs), to aid in diagnosis and treatment choices
and to find unexpected drug symptoms [7, 36]. These applications in general require structured data, which
is limited in quantity. Nevertheless, drug users persistently produce written data, which is a robust alternative to
Author’s address: Mehmet Bozuyla, mbozuyla05@posta.pau.edu.tr, Department of Electrical-Electronics Engineering, Faculty of Engineering,
Pamukkale University, Denizli, Turkey, 20020.
© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
2375-4699/2023/10-ART $15.00
https://doi.org/10.1145/3626523
ACM Trans. Asian Low-Resour. Lang. Inf. Process.