Proximity-based anomaly detection in Securing Water Treatment

Ermiyas Birihanu
Data Science and Engineering, ELTE Eötvös Loránd University, Budapest, Hungary
ermiyasbirihanu@inf.elte.hu

Áron Barcsa-Szabó
Data Science and Engineering, ELTE Eötvös Loránd University, Budapest, Hungary
ikx1sh@inf.elte.hu

Imre Lendák
Data Science and Engineering, ELTE Eötvös Loránd University, Budapest, Hungary
lendak@inf.elte.hu

Abstract—Industrial Control Systems (ICSs) rely on a range of sensors and embedded systems to operate. Devices often communicate using protocols like Siemens Step 7 and Modbus, which were designed for use in closed networks many years ago and are vulnerable to attacks. The goal of this study is to detect anomalies in industrial control systems using a proximity-based approach on the Securing Water Treatment (SWaT) dataset. We encoded categorical data using one-hot encoding and normalized numerical data using min-max scaling. The experiment showed that a proximity-based approach achieves state-of-the-art results of 99% precision and 98% recall and identifies 35 out of 37 attack points, indicating that the suggested methodology is suitable for use in industrial control system scenarios.

Index Terms—Anomaly Detection, Proximity, Industrial control system, SWaT

I. INTRODUCTION AND PROBLEM DEFINITION

Industrial control systems (ICSs) are built by combining computational algorithms and physical components for various mission-critical tasks. An ICS manages operations such as opening and closing valves and starting, monitoring, and ending processes in an automation system, for instance automatic water chlorination in water treatment. When industrial control systems initially start operating, they perform their assigned tasks within the established parameters. System operators can observe how the physical system works and also intervene in it through the ICS.
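The preprocessing named in the abstract (one-hot encoding of categorical data and min-max scaling of numerical data) can be illustrated with a minimal sketch. The column values, variable names, and helper functions below are hypothetical and are not taken from the SWaT dataset or from the authors' code; in practice a library such as scikit-learn would typically be used instead of hand-written helpers.

```python
def min_max_scale(values):
    """Rescale a numeric column to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def one_hot(values):
    """Encode a categorical column as one 0/1 indicator list per row."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories


# Hypothetical toy readings: a tank-level sensor and a valve-state actuator.
level = [500.0, 520.0, 810.0, 505.0]
valve = ["open", "closed", "open", "open"]

scaled_level = min_max_scale(level)
valve_bits, valve_categories = one_hot(valve)

# Concatenate per-row features: [scaled level, is_closed, is_open]
features = [[s] + bits for s, bits in zip(scaled_level, valve_bits)]
```

Scaling every numeric feature to the same range matters for proximity-based methods, because otherwise features with large raw magnitudes would dominate the distance computation.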
ICS components are linked together by a variety of electrical circuits and sensors, and they may interact with one another across a network, allowing the various system components to function together. Although traditional ICSs were isolated from the Internet, these systems are today more vulnerable to malicious attacks as they become more digitalized and connected to the Internet. ICS equipment often communicates using protocols such as Siemens Step 7 and Modbus, which were created for use in closed networks many years ago [1]. This leaves ICS equipment vulnerable, posing cyber-physical dangers ranging from substantial production disruption to hazardous failures that might affect human safety. A successful attack against an ICS would have a significant financial impact on the system operator. Operational disturbance, equipment damage, corporate waste, and intellectual property theft are also possible consequences of ICS attacks.

The SWaT dataset was collected from a fully operational scaled-down water treatment plant. This testbed consists of six industrial processes (marked P1 to P6) that work together to treat and distribute water [2]. The ICS in the SWaT testbed has two kinds of components: actuators and sensors. An actuator is a type of motor that controls or moves a mechanism or system and is powered by an energy source, usually electric current, hydraulic fluid pressure, or pneumatic pressure. Actuators are where physical work is done. Sensors are devices that detect changes in the physical environment, such as temperature, pressure, or distance.

The number of ICS attacks has risen in recent years due to enhanced Internet connectivity and the integration of commercial off-the-shelf (COTS) technology [3]. Intrusion detection systems (IDSs) are used to detect attacks against networks and systems. The main task of these systems is to detect malicious activities and to report the type of attack.
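As a hedged illustration of what proximity-based detection means in this setting, the sketch below scores each observation by its mean Euclidean distance to its k nearest neighbours among normal-operation data and flags scores above a threshold. This k-nearest-neighbour distance score is only one common proximity method and is not claimed to be the authors' exact technique; the data, the value of k, and the leave-one-out threshold rule are assumptions made for illustration.

```python
import math


def knn_scores(train, queries, k=2):
    """Score each query by its mean Euclidean distance to its k nearest training points."""
    scores = []
    for q in queries:
        dists = sorted(math.dist(q, t) for t in train)
        scores.append(sum(dists[:k]) / k)
    return scores


# Hypothetical normal-operation feature vectors, already scaled to [0, 1].
normal = [[0.10, 0.20], [0.12, 0.22], [0.11, 0.18], [0.09, 0.21]]

# Threshold assumption: the largest score among the normal points when each
# one is scored against the remaining points (leave-one-out).
threshold = max(
    knn_scores(normal[:i] + normal[i + 1:], [p])[0] for i, p in enumerate(normal)
)

# One query close to the normal cluster and one far away from it.
queries = [[0.11, 0.20], [0.90, 0.95]]
scores = knn_scores(normal, queries)
flags = [s > threshold for s in scores]  # True marks a suspected anomaly
```

Because the detector only needs examples of normal behaviour, it requires no attack signatures, but the choice of k and of the threshold directly trades missed attacks against false alarms.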
Intrusion detection systems are divided into two types: signature-based and anomaly-based [4]. Signature-based IDSs use known attack signatures stored in a database to detect attacks. These systems can detect only previously known intrusions; they cannot detect previously unknown attacks. Anomaly-based IDSs, on the other hand, detect anomalies in network traffic without using attack signatures. Because they do not rely on attack signatures, they can detect new attacks, but they may also generate false alarms.

There are various approaches for identifying anomalies [5] [6] in different domains, but to the best of the researchers' knowledge, no studies have been conducted on designing a proximity model in the domain of industrial control systems for securing water treatment. Anomaly detection models can be classified as extreme value analysis, probabilistic and statistical models, linear models, proximity-based models, information-theoretic models, and high-dimensional outlier detection. The main objective of this research was to analyze and compare the utility of proximity-based anomaly detection techniques in improving the security of water treatment systems. The contribution of this study was: using proximity methods for industrial control systems and comparing their performance in detecting anomalies in

978-1-6654-9653-7/22/$31.00 © 2022 IEEE