A Cloud-Native Honeynet Automation and Orchestration Framework

Research Project for CNIT-555

Akash Ravi, Bhavye Sharma, Avigyan Mukherjee
Dept of Computer and Information Technology
Purdue University
West Lafayette, USA

Abstract: This research uses containerization techniques to dynamically build honeynets that present a deceptive environment to the attacker, and proposes a framework for their use in existing cloud infrastructure. We aim to contribute to the existing literature on the creation of cost-effective dynamic honeynets. The proposed architecture can temporarily spin up honeypot services when it detects an attacker and log everything the attacker does for post-incident or experimental research. We simulate attacker isolation on cloud-native systems using Docker and integrate a virtual honeypot OS (Cowrie) to record logs. We then apply threat modeling to the proposed architecture to analyze it qualitatively and to propose a general framework for its implementation. Potential benefits of the proposed framework include risk mitigation, cost-effectiveness in terms of memory and process usage, and vendor agnosticism. It also addresses moving target defense, i.e., continuously shifting the configuration of the underlying system, which makes the honeynet implementation harder to comprehend.

Index Terms: Honeynet, Honeypot, Cloud Computing, Network Security, Kubernetes

I. INTRODUCTION

With highly interconnected computing systems running critical processes in today’s world, the need to ensure their security and reliability is higher than ever. Among the numerous methodologies in practice, honeypots are an important part of proactive defense strategies. They aim to detect, contain, and possibly deflect unauthorized access to a system [1]. They employ vulnerable services as bait to entice malicious actors to target a seemingly valuable resource.
However, these vulnerable systems neither cause harm to production services nor disclose any confidential data. Fake services and assets are placed within the system to lure attackers, and when an attacker infiltrates a system that has a honeypot installed, their actions are logged for analysis. In attempting to gain access to these decoys, attackers tend to reveal their tactics, techniques, and procedures (TTPs). Since honeypots serve no production purpose, any inbound or outbound traffic can be classified as potentially suspicious. Analyzing attacker behavior helps security analysts determine and triage the attack vectors that are most likely to affect the production systems as well. Knowing the attackers’ modus operandi can also help predict further actions by mapping the gathered knowledge to Advanced Persistent Threat (APT) reports from frameworks such as MITRE ATT&CK.

Traditional honeypots have usually focused on host-based attacks. However, since most real-world systems involve various interconnected hosts, there is a need to imitate the entire network when designing deceptive environments. Honeynets serve this purpose by modeling networks with multiple hosts, each running an instance of honeypot software along with the fake applications [2]. These generate collective intelligence on the attacker’s methods and help track their penetration. Honeynets are usually isolated from the rest of the network by physical or virtual means, using techniques such as VLANs, firewalls, and virtualization. This isolation contains the attackers’ lateral movement and also helps keep them diverted from pursuing real targets. Because they comprehensively replicate most aspects of the production environment, honeynets are categorized as high-interaction honeypots [3]. In contrast, low-interaction honeypots are much simpler and easier to deploy and maintain.
However, their presence can be easily detected, and they often do not present a holistic environment in which the actual applications and security functions can be tested.

While honeynets are difficult to detect, they are also difficult to maintain and costly to scale. Setting up a production-like network might require multiple hosts and services to be maintained. The issue is exacerbated by the complex nature of modern applications, which can include numerous cloud-native microservices, distributed pipelines, persistent data stores, and legacy monoliths. These applications can also be deployed in various settings: a few services could reside on-premises while the remaining ones run in a multi-cloud environment [4]. Such hybrid infrastructures make it complex to set up and maintain comprehensive honeynets. Even within a given environment, different types of virtualization technologies are used to automate and orchestrate deployments. Cloud computing tools such as Kubernetes enable this agile use of the underlying infrastructure. On the networking side, Software Defined Networking (SDN) has been revolutionizing traditional networks for quite some time. Together with Network Function Virtualization (NFV), it is possible to represent and manage all the way