International Journal of Artificial Intelligence & Robotics (IJAIR)
E-ISSN: 2686-6269
Vol.2, No.2, 2020, pp.34-41
DOI: 10.25139/ijair.v2i2.3030

Comparison of Clustering K-Means, Fuzzy C-Means, and Linkage for Nasa Active Fire Dataset

Muchamad Kurniawan 1, Rani Rotul Muhima 2, Siti Agustini 3*
1,2,3 Teknik Informatika ITATS, Arief Rachman Hakim 100, Surabaya, Indonesia
1 muchamadkurniawan@itats.ac.id; 2 ranimuhima@itats.ac.id; 3 *sitiagustini@itats.ac.id
*corresponding author

I. INTRODUCTION

The active fire dataset of the National Aeronautics and Space Administration (NASA) is obtained from the Visible Infrared Imaging Radiometer Suite (VIIRS) sensor, and the resulting image is a spectroradiometer image, as shown in Figure 1. The dataset has thirteen features: latitude, longitude, brightness, scan, track, acq date, acq time, satellite, confidence, version, bright_t31, frp, daylight. It is available from NASA's official website (https://earthdata.nasa.gov/earth-observation-data/near-real-time/firms/viirs-i-band-active-fire-data).

This dataset has been studied before. In [1], the active fire data were used to help prevent forest fires. The features were reduced to two, longitude and latitude, and the region was restricted to South and Southeast Asia. The algorithm combined Local Outlier Factor (LOF) with K-Means; with LOF applied, the accuracy of K-Means increased compared with simple K-Means. Other studies using this dataset include [2] [3] [4] [5] [6] [7], on the prevention, clustering, and monitoring of forest cover, and [8], on flare monitoring.

Cluster (group) analysis divides data into several groups based on data similarity. When new data arrive, they are assigned to a group according to the similarity of their features. There are two types of algorithmic approaches in cluster analysis: partition-based approaches and hierarchy-based approaches.
Partition-based methods include K-means, K-harmonic means, K-modes, Fuzzy C-means, and K-Medoid. Meanwhile, based on hierarchy, there are agglomerative linkage methods (single, complete, average), density-based clustering (DBSCAN), Spectral, and Graph Clustering [9] [10].

A partition-based clustering algorithm requires analysis to find the optimum number of clusters. Research [11] contributed to the development of the Davies-Bouldin technique in addition to advancing the K-Means method itself. Around the same period, the Davies-Bouldin and Silhouette indices were used to measure the performance of clustering methods [12] [13]. The Dunn and Silhouette indices have also been used to evaluate clusters produced by Clustering Large Application (CLARA) and K-Means. Using a statistical approach, research [14] improves the performance of the Dunn index with the K-Means clustering method. Existing Clustering Quality Metrics (CQMs) have been used for internal cluster validity [15].

Our research contributes an evaluation of which clustering method best fits this dataset, comparing several methods and measuring the clusters with various techniques. In this study, we use the active fire dataset from NASA to compare partition-based and hierarchical clustering: K-Means, Fuzzy C-Means (FCM), and Linkage. For the internal cluster analysis, we use Elbow. This document is divided into four parts: the first part is the introduction,

ABSTRACT

One of the causes of forest fires is slow handling when a fire occurs. This can be anticipated by determining how many extinguishing units are located at the center of the hotspots. To locate hotspots, NASA provides an active fire dataset. Clustering is used to find the optimal centroid points. The clustering methods we use are K-Means, Fuzzy C-Means (FCM), and Average Linkage. K-Means was chosen because it is a simple method that has been applied in many areas.
FCM is a partition-based clustering algorithm developed from the K-Means method. The hierarchy-based clustering approach is represented by the Average Linkage method. The measurement used is the sum of the internal distances of each cluster, and Elbow evaluation is used to determine the optimal number of clusters. K-Means gave the best result, with a total distance of 145.35 km, and the best number of clusters for this method was 4. The total distances obtained with the FCM and Linkage methods were 154.13 km and 266.61 km, respectively.

Keywords: Active fire dataset, K-Means, FCM, Linkage, Elbow Clustering.

Article History
Received: September 7th, 2020
Revised: November 26th, 2020
Accepted: November 28th, 2020
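
The evaluation summarized in the abstract (a total internal distance per clustering, followed by an Elbow check) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the internal distance is taken to be the haversine distance in km from each hotspot to its cluster centroid, K-Means uses a deterministic farthest-first initialisation, the elbow is picked as the largest second difference of the totals, and the coordinates are synthetic rather than taken from the NASA dataset.

```python
import math


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def sq_dist(p, q):
    """Squared Euclidean distance in degree space (used for assignment only)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def kmeans(points, k, iters=100):
    """Lloyd's K-Means on (lat, lon) tuples; returns (centroids, labels)."""
    # Assumption: farthest-first initialisation, chosen only so that the
    # sketch is reproducible; the source does not state an initialisation.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New centroid is the mean latitude and longitude of the members.
                updated.append((sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members)))
            else:
                # An empty cluster keeps its previous centroid.
                updated.append(centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def total_internal_distance_km(points, centroids, labels):
    """Assumed metric: sum of point-to-own-centroid distances in km."""
    return sum(haversine_km(p, centroids[lab]) for p, lab in zip(points, labels))


def elbow_k(totals):
    """Heuristic elbow: the k with the largest second difference of the totals."""
    ks = sorted(totals)
    return max(ks[1:-1],
               key=lambda k: totals[k - 1] - 2 * totals[k] + totals[k + 1])


# Synthetic hotspots (hypothetical): three tight groups of five points each.
centers = [(0.0, 100.0), (0.0, 110.0), (10.0, 105.0)]
offsets = [(-0.1, -0.1), (-0.1, 0.1), (0.1, -0.1), (0.1, 0.1), (0.0, 0.0)]
points = [(c[0] + o[0], c[1] + o[1]) for c in centers for o in offsets]

totals = {}
for k in range(1, 7):
    centroids, labels = kmeans(points, k)
    totals[k] = total_internal_distance_km(points, centroids, labels)

best_k = elbow_k(totals)
print(best_k, {k: round(v, 2) for k, v in totals.items()})
```

The same `total_internal_distance_km` function could be applied to labels produced by FCM or Average Linkage, which would put the three methods on one scale in the way the abstract compares them. An Elbow read from a plot may differ from the second-difference heuristic used here.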