Sentiment Analysis of Turkish Drug Reviews with Bidirectional Encoder Representations from Transformers

MEHMET BOZUYLA, Department of Electrical-Electronics Engineering, Faculty of Engineering, Pamukkale University, Turkey

Sentiment analysis of user-generated product or service reviews is significant for enhancing quality. Among healthcare-related computational linguistics studies, the analysis of user criticism of drugs is of particular importance. Sentiment analysis of healthcare reviews reveals the relations between patients, doctors, and healthcare services. More specifically, sentiment analysis of drug reviews may be used to identify adverse drug reactions (ADRs), to assist diagnosis and treatment, and to support personalized therapy recommendations. Most drug review sentiment studies are in English. Although Turkish is a widely spoken language, limited research has been conducted in the medical domain, and in particular no study addresses drug review sentiment analysis. In this study, we generated a Turkish drug review dataset and evaluated it in detail for sentiment analysis using (i) traditional machine learning algorithms with language pre-processing steps, stemming, and feature selection, (ii) deep learning algorithms with a word2vec embedding language model, and (iii) various bidirectional encoder representations from transformers (BERT) models. The experiments show that neural transformers are promising for Turkish drug review sentiment identification. In particular, the Turkish-dedicated BERT (BERTurk) achieved a weighted-F1 score of 95.1%, the best drug review sentiment prediction performance.

CCS Concepts: • Computing methodologies → Machine learning algorithms; Natural language processing; • Social and professional topics → Medical information policy.
Additional Key Words and Phrases: Turkish, Drug Review, Word Embedding, Bidirectional Transformer

Author’s address: Mehmet Bozuyla, mbozuyla05@posta.pau.edu.tr, Department of Electrical-Electronics Engineering, Faculty of Engineering, Pamukkale University, Denizli, Turkey, 20020.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM. 2375-4699/2023/10-ART $15.00 https://doi.org/10.1145/3626523

ACM Trans. Asian Low-Resour. Lang. Inf. Process.

1 INTRODUCTION

The continuous generation of user text about products and services calls for analysis of those reviews to enhance the quality of the products or services. Sentiment Analysis (SA) research proposes automated natural language processing (NLP) methods to extract information from the huge volume of user-generated data in various domains such as tourism, marketing, and even politics [9]. Medical or healthcare review analysis, an SA domain that has only recently gained importance, has relatively few studies in the literature compared to traditional user review analysis areas. In particular, academic searches on drug review sentiment return even fewer results. Healthcare-based user reviews may be used to extract information about the diagnosis of a disease, the status of a patient’s health, or the effectiveness of a medical treatment. For example, a patient may describe his/her experience with a treatment received for a disease [44]. Analysis of drug reviews may also yield useful insights for the healthcare domain. In particular, SA of drug reviews may be used to address adverse drug reactions (ADRs), to aid in diagnosis and treatment choices, and to find unexpected drug symptoms [7, 36]. These applications in general require structured data, which is limited in quantity. Nevertheless, drug users persistently produce written data, a robust alternative to