Jurnal Elektronik Ilmu Komputer Udayana, Volume 12, No 1. August 2023
p-ISSN: 2301-5373, e-ISSN: 2654-5101

Music Genre Classification Using Random Forest Model

Ivan Luis Simarmata a1, I Wayan Supriana a2

a Informatics Department, Faculty of Mathematics and Natural Sciences, University of Udayana
South Kuta, Badung, Bali, Indonesia
1 ivan_luis030602@protonmaill.com
2 wayan.supirana@unud.ac.id (Corresponding author)

Abstract

A music genre is a grouping of music based on style. Grouping music into genres manually is a long and tedious task, because one must listen to each song individually and decide which genre it belongs to. This process can be automated with classification models such as Random Forest. Random Forest is an ensemble variant of the decision tree model: it combines multiple decision trees to produce a single result. This paper tests the Random Forest model and, for comparison, the XGB Classification model. XGB Classification is chosen for the comparison because it is similar to Random Forest: it is also a tree-based ensemble, and it uses CART as its tree. The results show that the Random Forest model achieves an accuracy of 72% when all audio features are included, and that the XGB Classification model achieves an accuracy of 73% with some audio features dropped.

Keywords: Classification, Decision Tree, Random Forest, Accuracy, XGB Classification

1. Introduction

Music genre is a way to categorize music based on its style and features [1]. Music can be classified by genre manually by listening to each song individually, but doing so consumes a lot of time and effort, which makes it an inefficient method. An automatic process is therefore required to help classify music [2]. Music genre classification is a problem that has been studied by the Music Information Retrieval community.
There are many classification models that can be used, including Support Vector Machine, K-Nearest Neighbor, Decision Tree, and many more. One of the well-known models is K-Nearest Neighbor [2]. For this research, however, the model used is the Random Forest model, which is an ensemble variant of the Decision Tree model.

Previous research [3] used decision trees to classify Latin music genres. It used two types of decision tree model: Classification and Regression Trees (CART) and the C4.5 algorithm. With these models, it achieved an accuracy of only 55% to 62%. Other research [2] used the Modified K-Nearest Neighbor model to classify music from the GTZAN dataset, and concluded that the model was able to classify music with an accuracy of 55.3%.

Based on both studies, the authors' goal is to use the Random Forest model and the XGB Classification model to classify music genres, since both models are tree-based variants of the Decision Tree model [4]–[6]. The dataset that will be used is the GTZAN dataset, following the research in [2].

2. Research Methods

The system will start by analyzing the audio features from the dataset to see which features are related. After that, the features will be preprocessed. In the first scenario, all features will be included for training and testing. In the second scenario, some features will be dropped. Both scenarios are run to see which of the two models, Random Forest or XGB Classification, is more accurate. The flow of the system can be seen in Figure 1.
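The two-scenario comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the data is a synthetic two-genre toy set rather than GTZAN, each tree is a depth-1 CART-style stump rather than a full decision tree, XGB Classification is omitted, and every function name (`gini`, `fit_stump`, `fit_forest`, `drop_features`, `run_scenarios`) is hypothetical. It only shows how a forest combines bootstrap-trained trees by majority vote, and how the same model can be evaluated with all features and with some features dropped.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0


def fit_stump(X, y, feature_ids):
    """Fit a depth-1 CART-style tree on the given candidate features."""
    majority = Counter(y).most_common(1)[0][0]
    best = None  # (weighted Gini, feature, threshold, left label, right label)
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:  # no valid split: always predict the majority class
        return lambda row: majority
    _, f, t, left_label, right_label = best
    return lambda row: left_label if row[f] <= t else right_label


def fit_forest(X, y, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    k = max(1, int(d ** 0.5))  # features considered per tree
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = rng.sample(range(d), k)
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return trees


def predict(trees, row):
    """Majority vote over all trees in the forest."""
    return Counter(tree(row) for tree in trees).most_common(1)[0][0]


def accuracy(trees, X, y):
    return sum(predict(trees, row) == lab for row, lab in zip(X, y)) / len(y)


def drop_features(X, dropped):
    """Return a copy of X without the columns whose indices are in `dropped`."""
    return [[v for j, v in enumerate(row) if j not in dropped] for row in X]


def make_toy_data(n=200, seed=1):
    """Synthetic stand-in for audio features: columns 0 and 1 carry signal, column 2 is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.choice(["rock", "jazz"])
        shift = 2.0 if label == "rock" else 0.0
        X.append([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y


def run_scenarios():
    """Scenario 1 keeps all features; scenario 2 drops the noise column."""
    X, y = make_toy_data()
    split = int(0.8 * len(X))  # 80/20 train/test split (an assumption)
    results = {}
    for name, dropped in [("all features", set()), ("noise feature dropped", {2})]:
        X_train = drop_features(X[:split], dropped)
        X_test = drop_features(X[split:], dropped)
        forest = fit_forest(X_train, y[:split])
        results[name] = accuracy(forest, X_test, y[split:])
    return results


if __name__ == "__main__":
    for scenario, acc in run_scenarios().items():
        print(f"{scenario}: accuracy = {acc:.2f}")
```

Running the script prints one accuracy per scenario. The seeds are fixed so the run is repeatable, but the numbers come from the toy data and say nothing about the results reported for GTZAN.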