Manufacturing Resilient Bi-Opaque Predicates against Symbolic Execution

Hui Xu ∗†, Yangfan Zhou ‡§, Yu Kang, Fengzhi Tu, Michael R. Lyu ∗†

Shenzhen Research Institute, The Chinese University of Hong Kong
Dept. of Computer Science and Engineering, The Chinese University of Hong Kong
School of Computer Science, Fudan University
§ Engineering Research Center of Cyber Security Auditing and Monitoring, Ministry of Education

Abstract—Control-flow obfuscation increases program complexity by semantic-preserving transformation. Opaque predicates are essential gadgets to achieve such transformation. However, we observe that real-world opaque predicates are generally very simple and are designed with little security consideration. Recently, such insecure opaque predicates have come under severe attack from symbolic execution-based adversaries, which jeopardizes the security of control-flow obfuscation. This paper, therefore, proposes symbolic opaque predicates, which can be resilient to symbolic execution-based adversaries. We design a general framework to compose such opaque predicates, which requires introducing challenging symbolic analysis problems (e.g., symbolic memory) in each opaque predicate. In this way, we may mislead symbolic execution engines into reaching false conclusions. We observe a novel bi-opaque property of symbolic opaque predicates: they can cause not only false negatives but also false positives for attackers. To evaluate the efficacy of our idea, we have implemented a prototype obfuscation tool based on Obfuscator-LLVM and conducted experiments with real-world programs. Our evaluation results show that symbolic opaque predicates demonstrate excellent resilience to prevalent symbolic execution engines, such as BAP, Triton, and Angr. Moreover, although the costs of symbolic opaque predicates may vary for different problem settings, some predicates can be very efficient. Therefore, our framework is both secure and usable.
Users can follow the framework to introduce symbolic opaque predicates into their obfuscation tools and make them more powerful.

I. INTRODUCTION

Obfuscation is a widely employed technique that protects software from reverse engineering. It transforms programs into unintelligible versions while preserving their original functionalities. Obfuscation can be achieved via lexical transformation, control-flow transformation, data-flow transformation, etc. [1]. Such obfuscation transformation techniques are orthogonal to each other and can be employed simultaneously.

This paper focuses on control-flow obfuscation, which increases software complexity (e.g., by adding bogus control flows) against reverse control-flow analysis. Opaque predicates are essential gadgets to achieve such obfuscation transformation. An opaque predicate is a predicate whose value is known before obfuscation time but is difficult to deduce by reverse analysis. Because it holds some deterministic properties, we can employ opaque predicates to transform a program without changing its semantics. For example, we can add a bogus code block after a constantly false opaque predicate and guarantee that the code block will never be executed.

In practice, opaque constants (e.g., x^2 = -1) are the most prevalent type of opaque predicate adopted by obfuscation tools, such as Obfuscator-LLVM [2]. Although other approaches (e.g., unsolved conjectures [3]) may demonstrate better security, they are not widely adopted due to either implementation or performance issues [4].

Recently, the security of opaque predicates has been greatly challenged by the development of symbolic execution techniques. Notably, Ming et al. have proposed an opaque predicate detection approach based on symbolic execution [5]; Yadegari et al. have demonstrated the effectiveness of deobfuscation attacks based on symbolic execution [6].
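To make the bogus-block idea from this introduction concrete, the following is a minimal sketch in Python rather than LLVM IR. It is an illustration under stated assumptions, not the output of Obfuscator-LLVM or of our tool: the function names (`opaque_false`, `price_original`, `price_obfuscated`, `same_behaviour`) and the arithmetic are hypothetical. It guards a bogus block with the constantly false predicate x^2 = -1 and checks that the obfuscated function behaves like the original on a sample range.

```python
def opaque_false(x: int) -> bool:
    """Constantly false opaque predicate: no integer satisfies x^2 = -1."""
    return x * x == -1


def price_original(quantity: int) -> int:
    """Hypothetical function to be protected."""
    return quantity * 3 + 1


def price_obfuscated(quantity: int) -> int:
    """Same semantics as price_original, with an added bogus control flow."""
    if opaque_false(quantity):
        # Bogus block: guarded by a predicate that is never true,
        # so this statement is never executed.
        return quantity * 7 - 2
    return quantity * 3 + 1


def same_behaviour(lo: int, hi: int) -> bool:
    """Check that both versions agree on every input in [lo, hi)."""
    return all(price_original(q) == price_obfuscated(q) for q in range(lo, hi))
```

A predicate this simple is exactly the kind that a constraint solver can prove unsatisfiable, which is why such opaque constants are an easy target for the attacks cited above.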
Symbolic execution is a program analysis approach that models the conditions for executing alternative control flows. It attempts to find test cases that can satisfy such conditions. If a condition cannot be satisfied, it may indicate a bogus control flow or an opaque predicate. Symbolic execution-based attacks are not new to the research community, but with the development of symbolic execution techniques, such attacks have recently become practical and jeopardize the robustness of obfuscated software.

In this work, we propose a novel framework to manufacture symbolic opaque predicates that are resistant to symbolic execution-based adversaries. A key procedure in our framework is to introduce problems that are challenging for symbolic execution to analyze, such as symbolic memory and parallel programming [7]. Moreover, we observe a bi-opaque property of such opaque predicates, i.e., they may mislead an attacker either into falsely recognizing an opaque predicate as a normal predicate, or into falsely recognizing a normal predicate as an opaque predicate.

We have implemented a prototype tool based on Obfuscator-LLVM [2]. Our tool 1 automatically replaces the opaque predicates generated by Obfuscator-LLVM with symbolic opaque predicates at the IR (intermediate representation) level. It employs a repository-based mechanism to manage different templates of symbolic opaque predicates. Currently, we have implemented several templates in the repository, which attack symbolic execution with symbolic memory, floating-point numbers, covert propagation, and parallel programming. The

1 Our project URL is https://github.com/hxuhack/symobfuscator

2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks. 2158-3927/18/$31.00 ©2018 IEEE. DOI 10.1109/DSN.2018.00073