Towards Integral Binary Execution: Implementing Oblivious Hashing Using Overlapped Instruction Encodings

Matthias Jacob
Nokia Research Center
United Kingdom
majacob@nokia.com

Mariusz H. Jakubowski
Microsoft Research
Redmond, WA
mariuszj@microsoft.com

Ramarathnam Venkatesan
Microsoft Research
Redmond, WA (USA)
venkie@microsoft.com

ABSTRACT
Executing binaries without interference by an outside adversary has been an ongoing duel between protection methods and attacks. Recently, an efficient kernel-patch attack has been presented against commonly used self-checking code techniques that use checksumming ahead of execution. While methods based on self-modifying code can defend against this attack, such techniques depend on low-level architectural details and may not be practical in the long run.

An alternative defense is to use oblivious hashing (OH). Instead of checking code integrity prior to execution, OH can verify untampered runtime behavior continuously. However, earlier OH approaches have some weaknesses, particularly with binary code: Physical instruction bytes cannot be easily checked during execution, and an attacker may be able to detect and remove OH checks, since OH alone does not provide tamper-resistance or obfuscation.

In our approach, we deliberately overlap a program’s basic blocks so that they share instruction bytes. This increases tamper-resistance implicitly because malicious modifications affect multiple instructions simultaneously. Also, our scheme facilitates explicit anti-tampering checks via injection of OH instructions overlapped with target code, enabling OH that can verify integrity of both runtime state and executing instructions. Thus, our method addresses anti-checksum attacks without resorting to self-modifying code, and also extends OH to verify physical code, not only program state. In addition, overlapping facilitates resistance against disassembly and decompilation.
Our approach works on processor architectures and byte-codes that support variable-length instructions. To our knowledge, this is the first technique that blends tamper-resistance into architecture and therefore significantly improves robustness of binaries.

Categories and Subject Descriptors
D.m [Software]: Miscellaneous—Software protection; C.5.0 [Computer Systems Organization]: Computer System Implementation—General; D.2.11 [Software Engineering]: Software Architectures—Information hiding

General Terms
Algorithms, reliability, security

Keywords
Oblivious hashing, tamper-resistance, integrity checking, obfuscation, overlapped code, anti-disassembly

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
MM&Sec’07, September 20–21, 2007, Dallas, Texas, USA.
Copyright 2007 ACM 978-1-59593-857-2/07/0009 ...$5.00.

1. INTRODUCTION
Protecting media players against deliberate attacks is becoming increasingly important on today’s Internet. Software pirates often compromise player software to remove copy-protection and watermarks from digital content. In addition, computer viruses turn normal PCs into malicious attack hosts that hit the net almost every day, thereby slowing down servers and network access.

On open platforms such as PCs, virtual memory provides protection among processes, but an adversary with access to the OS kernel is able to take over and tamper with any process. To protect against these attacks in software, various methods for tamper-resistance have been developed (e.g., [8, 13, 26]).
These turn a program P into an equivalent tamper-resistant program P′ = O(P), which detects manipulation attempts. However, P′ is commonly based on techniques that verify code integrity before execution takes place. With full access to the PC, an attacker can circumvent these protection schemes – e.g., by tampering with the code after the integrity check takes place, or by executing tampered code while passing original code to checksum and hash routines [47]. Self-modifying code can protect against this type of attack [25], but may not always be viable, since low-level operations can be obstructed by specific architectural features (e.g., the execute-only bit on the x86 platform).

In contrast to other checksumming techniques, oblivious hashing (OH) does not verify machine-code bytes, but computes input-specific hashes based on program state during execution [29, 14, 35]. After initializing a hash, OH typically updates it with results of variable assignments, as well as with unique identifiers based on executed branches. At various points in a program, OH verifies the current hash against a pre-computed correct hash. Alternately, OH may compare hashes computed by two or more individualized replicas of the same code.

While OH has been implemented for high-level code [14] and Java byte-code [35], OH has seldom been investigated on low-