IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
Vol. 14, No. 2, April 2020, pp. 169-178
ISSN (print): 1978-1520, ISSN (online): 2460-7258
DOI: 10.22146/ijccs.51743
Received November 21, 2019; Revised April 10, 2020; Accepted April 29, 2020

Bidirectional Long Short Term Memory Method and Word2vec Extraction Approach for Hate Speech Detection

Auliya Rahman Isnain*1, Agus Sihabuddin2, Yohanes Suyanto3
1 Master Program of Computer Science, FMIPA UGM, Yogyakarta, Indonesia
2,3 Department of Computer Science and Electronics, FMIPA UGM, Yogyakarta, Indonesia
e-mail: *1 rahman.isnain@gmail.com, 2 a_sihabudin@ugm.ac.id, 3 yanto@ugm.ac.id

Abstract
Hate speech is currently a widely discussed topic in Indonesia, particularly on social media. Hate speech is communication that disparages a person or group on the basis of characteristics such as race, ethnicity, gender, citizenship, religion, or organization. Twitter is a social media platform on which users express their feelings and opinions through tweets, including tweets that contain hate speech, and it has a significant influence on the success or destruction of a person's image. This study aims to classify Indonesian-language tweets as hate speech or not hate speech using the Bidirectional Long Short Term Memory (BiLSTM) method together with word2vec feature extraction in the Continuous Bag-of-Words (CBOW) architecture. The BiLSTM method is evaluated by computing accuracy, precision, recall, and F-measure. Using word2vec with the CBOW architecture and BiLSTM with 10 epochs, a learning rate of 0.001, and 200 neurons in the hidden layer yields an accuracy of 94.66%, a precision of 99.08%, a recall of 93.74%, and an F-measure of 96.29%. BiLSTM with three layers achieves an accuracy of 96.93%; adding one layer to BiLSTM thus increases accuracy by 2.27 percentage points.

Keywords: Hate Speech, LSTM, BiLSTM, Word2vec, CBOW, Skipgram, Twitter
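The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It assumes a binary setting in which "hate speech" is the positive class and uses the standard confusion-matrix definitions. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the study. The last lines only re-derive the reported accuracy gain from the two accuracy figures given in the abstract.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F-measure from
    confusion-matrix counts (positive class = hate speech)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical counts chosen for illustration only (not the paper's data).
example = classification_metrics(tp=90, fp=10, fn=30, tn=70)

# Accuracy gain reported in the abstract, in percentage points.
base_accuracy = 94.66         # BiLSTM configuration reported first
three_layer_accuracy = 96.93  # BiLSTM with three layers
gain = round(three_layer_accuracy - base_accuracy, 2)

if __name__ == "__main__":
    for name, value in example.items():
        print(f"{name}: {value:.4f}")
    print(f"accuracy gain: {gain} percentage points")
```

The zero-denominator guards return 0.0 rather than raising an error, which is one common convention; other conventions (for example, reporting the value as undefined) are equally defensible.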