Citation: Tamamadin, M.; Lee, C.; Kee, S.-H.; Yee, J.-J. Regional Typhoon Track Prediction Using Ensemble k-Nearest Neighbor Machine Learning in the GIS Environment. Remote Sens. 2022, 14, 5292. https://doi.org/10.3390/rs14215292

Academic Editor: Silas Michaelides

Received: 29 August 2022
Accepted: 19 October 2022
Published: 22 October 2022

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

remote sensing

Article

Regional Typhoon Track Prediction Using Ensemble k-Nearest Neighbor Machine Learning in the GIS Environment

Mamad Tamamadin 1,2, Changkye Lee 3, Seong-Hoon Kee 1,3 and Jurng-Jae Yee 1,3,*

1 Department of ICT Integrated Ocean Smart Cities Engineering, Dong-A University, Busan 49315, Korea
2 Department of Meteorology, Institut Teknologi Bandung, Bandung 40132, Indonesia
3 University Core Research Center for Disaster-free & Safe Ocean City Construction, Dong-A University, Busan 49315, Korea
* Correspondence: jjyee@dau.ac.kr

Abstract: This paper presents a novel approach, based on ensemble k-Nearest Neighbor (k-NN) in a GIS environment, for predicting the tracks of typhoons that may impact a region. In this work, past typhoon tracks are zonally split into left and right classes relative to the current typhoon track and then grouped into ensemble members, each containing three (left-center-right) typhoons. The proximity of the current typhoon to the left and/or right class is determined using a supervised k-NN classification algorithm. The track dataset created from the current typhoon and the typhoons of the similar class is then used to train a supervised k-NN regression model that predicts the current typhoon track.
Ensemble averaging is then performed over all typhoon track groups to obtain the final track prediction. It is found that the number of ensemble members does not necessarily affect the accuracy; the determination of similarity at the outset, however, plays a key role. A series of tests shows that the present method can produce typhoon track predictions with a fast simulation time, high accuracy, and long prediction duration.

Keywords: k-Nearest Neighbor; GIS processing; machine learning; similarity; typhoon track prediction

1. Introduction

Typhoons are extreme weather events that typically harm coastal areas [1]. They bring strong winds, floods, and extreme waves [2], which can damage infrastructure and disrupt transportation and human activity [3,4]. The city of Busan, located on the Korea/Tsushima Strait, is often affected by typhoons [5,6], whose impacts are felt both during direct landfall and during passage through surrounding areas, namely Ulsan city [7] and Gyeongsangnam-do [8]. To reduce these severe impacts, typhoon prediction is essential. However, problems remain with prediction accuracy, especially for track, intensity, and impact risk. Improved or new approaches are required to produce more accurate typhoon predictions. This work aims to develop a new approach that more accurately predicts the tracks of typhoons approaching or making landfall in a region.

The following forecasting models have been developed for operational and research use to anticipate typhoon impacts [9]: (a) averaging across occurrences, (b) numerical and dynamical modeling, (c) statistical models, (d) pattern similarities, (e) data assimilation, and (f) microseismic signals. Firstly, the averaging technique extrapolates typhoon tracks, and its performance depends on the selection of past typhoon positions.
Secondly, numerical and dynamical modeling predicts typhoons using numerical approximations of the mathematical equations that describe the physical forces affecting the cyclone [10,11]. This method requires supercomputers that repeatedly calculate values at every grid point using input data, such as global weather forecasts and static geographical data, as initial and boundary conditions [12]. In addition, the method still lacks accuracy owing to inaccurate vortex initialization of typhoons, incomplete representation