Diar: Removing Uninteresting Bytes from Seeds in Software Fuzzing

Aftab Hussain, Mohammad Amin Alipour
ahussain27@uh.edu, maalipou@central.uh.edu
University of Houston, Houston, TX, USA

Abstract. Software fuzzing mutates bytes in the test seeds to explore different behaviors of the program under test. Initial seeds can have a great impact on the performance of a fuzzing campaign. Mutating a lot of uninteresting bytes in a large seed wastes fuzzing resources. In this paper, we present the preliminary results of our approach that aims to improve the performance of fuzzers by identifying and removing uninteresting bytes in the seeds. In particular, we present Diar, a technique that reduces the size of the seeds based on their coverage. Our preliminary results suggest that fuzzing campaigns that start with reduced seeds find new paths faster and can produce higher coverage overall.

1 Introduction

Coverage-guided fuzzers, fuzzers for short, have become an important tool in testing software systems and uncovering bugs and software vulnerabilities. Due to their easy-to-use design and proven potential for finding bugs and vulnerabilities, they are increasingly being adopted in industry, and popular coverage-guided fuzzers, like AFL [25] and AFL++ [6], are regularly used for testing applications at large companies. These fuzzers mutate bytes in the seeds to generate new tests, and coverage feedback steers this test generation.

There is a large body of work dedicated to improving the performance of fuzzers. The majority of work in this area has been concerned with the mutation operators and scheduling in steering the test generation. However, seed selection has only recently gained some interest [10], and most papers treat it “casually” [12]. Seeds can impact the performance of fuzzing.
While a small seed filled with bytes associated with interesting behavior can help fuzzers explore a larger state space of programs faster, a large seed with many uninteresting bytes that do not contribute to the interesting behaviors of the program, e.g., payload data in network protocols, will trap the fuzzers in a long sequence of futile mutations of bytes. One solution to this problem could be to remove such large seeds; however, we note that in some settings the seeds can be inherently large, e.g., object or media files, and hence such solutions are inapplicable. Therefore, we need approaches that preprocess the seeds to identify uninteresting bytes and exclude them from mutation.

arXiv:2112.13297v1 [cs.SE] 25 Dec 2021
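To make the idea of removing uninteresting bytes concrete, the following is a minimal Python sketch of coverage-preserving seed reduction. It is an illustration under stated assumptions, not the Diar algorithm itself: `toy_coverage` is a hypothetical stand-in for an instrumented program under test, and `reduce_seed` is a simple greedy chunk-deletion loop chosen only for clarity. A byte range is treated as uninteresting when deleting it leaves the observed coverage unchanged.

```python
from typing import Callable, FrozenSet

Coverage = FrozenSet[str]


def toy_coverage(data: bytes) -> Coverage:
    """Hypothetical stand-in for an instrumented target: returns the branches hit."""
    hit = {"entry"}
    if data.startswith(b"HDR"):
        hit.add("header_ok")
        if b"\x01" in data[3:]:
            hit.add("flag_set")
    return frozenset(hit)


def reduce_seed(seed: bytes, coverage_of: Callable[[bytes], Coverage]) -> bytes:
    """Greedily delete chunks whose removal leaves coverage unchanged (illustrative only)."""
    target = coverage_of(seed)
    chunk = max(len(seed) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(seed):
            candidate = seed[:i] + seed[i + chunk:]
            if coverage_of(candidate) == target:
                seed = candidate  # chunk was uninteresting: drop it, retry at the same offset
            else:
                i += chunk        # chunk affects coverage: keep it and move on
        chunk //= 2               # retry with finer-grained chunks
    return seed


# A large seed whose payload bytes ("A" and "B") do not affect coverage.
seed = b"HDR" + b"A" * 50 + b"\x01" + b"B" * 50
reduced = reduce_seed(seed, toy_coverage)
print(len(seed), len(reduced))  # 104 4
```

In this toy setting only the header and the flag byte survive, so a fuzzer starting from the reduced seed would spend its mutations on bytes that influence program behavior. A real setup would obtain coverage from the fuzzer's instrumentation rather than from a Python function.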