Accepted by editor: 11-12-2020 | Final Revision: 21-12-2020 | Online Publication : 22-12-2020
Accredited by National Journal Accreditation (ARJUNA) Managed by
Ministry of Research, Technology, and Higher Education, Republic Indonesia with Second Grade (Peringkat 2, Sinta 2)
since year 2017 to 2021 according to the decree No. 10/E/KPT/2019.
Published online on the journal’s webpage: http://jurnal.iaii.or.id
RESTI journal
(System Engineering and Information Technology)
Vol. 4 No. 6 (2020) 1142 – 1148 ISSN Electronic Media: 2580-0760
Sentiment Analysis for Detecting Cyberbullying Using TF-IDF and SVM
Wahyu Adi Prabowo¹, Fitriani Azizah²
¹,² Department of Informatics Engineering, Faculty of Informatics, Telkom Institute of Technology Purwokerto
¹ wahyuadi@ittelkom-pwt.ac.id, ² 16102051@ittelkom-pwt.ac.id
Abstract
Social media has become a new method of communication in the digital era. Children and adults alike use social media
heavily to interact with others, and it has shifted conventional communication into digital form. This development poses
a serious problem because acts of cyberbullying are increasingly common. Cyberbullying can harm victims psychologically,
causing depression and even suicide. The dangers of cyberbullying are troubling and cause concern in the community.
Therefore, this study analyzes the sentiment of comments on social media platforms to determine their sentiment value.
The comment data are processed in a sentiment analysis pipeline with the following steps: preprocessing, Term
Frequency-Inverse Document Frequency (TF-IDF) weighting, and classification with the Support Vector Machine (SVM) method.
The dataset consists of 1500 comments collected by crawling with Python libraries and is divided into 80% training data
and 20% test data. In testing, the model achieved an accuracy of 93%, a precision of 95%, and a recall of 97%. This
research also designs a system model that can be integrated with the browser to open a user page showing the
classification of comments that have been input into the system.
Keywords: Preprocessing, Term Frequency and Inverse Document Frequency, Support Vector Machine, Confusion Matrix,
Application, Sentiment Analysis
© 2020 RESTI Journal
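As an illustration of two steps named in the abstract, the following minimal Python sketch computes TF-IDF weights and derives accuracy, precision, and recall from confusion-matrix counts. It assumes the textbook weighting tf × log(N/df) and simple whitespace tokenization, which may differ from the variant used in this study. The function names `tf_idf` and `evaluate`, the toy comments, and the example counts are hypothetical and are not the study's implementation or data.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf(t, d) * log(N / df(t)), with tf as the relative
    term frequency. The study may use a smoothed or normalized variant.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def evaluate(tp, fp, fn, tn):
    """Accuracy, precision, and recall from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical toy comments, not taken from the study's dataset.
comments = ["you are so stupid", "you are so kind", "what a kind person"]
weights = tf_idf(comments)

# Hypothetical confusion-matrix counts, not the study's results.
accuracy, precision, recall = evaluate(tp=8, fp=2, fn=1, tn=9)
```

In this toy corpus the term "stupid" occurs in only one comment, so it receives a higher weight in the first comment than "you", which occurs in two. This is the property that makes TF-IDF features useful as input to a classifier such as SVM.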
1. Introduction
For decades, the internet has been a part of daily life
that can change the behavior of both children and
adults [1], [2]. The internet is a network that connects
information and communication globally. It is also an
alternative way to obtain information directly from its
sources [3]. The rapid growth of social networks has
changed the meaning of friendship, relationships, and
social communication. People interact through social
media such as Facebook, Twitter, Myspace, and
YouTube, which are accessed simultaneously [4]. With
the rapid growth of social media, cyberbullying has
become a serious problem on social networks,
especially for teenagers and adults [2]. Cyberbullying
is defined as an aggressive and deliberate act,
committed by a group or an individual using a form of
electronic contact, repeatedly or over time, to harm a
victim who cannot easily defend himself [5]. People
have begun to realize that the incidence of
cyberbullying has increased in recent decades, and
some research shows that half of teenagers and
members of society experience cyberbullying [6].
Cyberbullying also contributes to depressive stress,
decreased self-esteem, despair, and suicidal desire
among adolescents [7].
Social media lets users communicate not only through
text but also through images and video. Content in
these forms circulates across the internet and can reach
a wide audience quickly. This capability creates many
opportunities, but there are concerns that increased
online activity could lead to deliberate crime and
harassment such as cyberbullying. Social media apps are
already very popular among all groups, and as the
popularity of these platforms grows, so does the
cyberbullying that occurs through them [8], [9].
The cyberbullying phenomenon has drawn particular
attention from the public and from social media users,
and researchers are interested in using information
technology to detect cases of cyberbullying. To detect
cyberbullying, researchers can apply data mining
concepts to find text patterns, analyze text, and
summarize useful information [10].
Even in research with the naïve Bayes method, the