Overcoming the obfuscation of Java programs by identifier renaming

S. Cimato, A. De Santis, U. Ferraro Petrillo *

Dipartimento di Informatica ed Applicazioni, Università degli Studi di Salerno, Via S. Allende, 84081 Baronissi (Salerno), Italy

Received 3 August 2004; received in revised form 4 November 2004; accepted 14 November 2004
Available online 8 January 2005

Abstract

Decompilation is the process of translating object code to source code and is usually the first step towards the reverse-engineering of an application. Many obfuscation techniques and tools have been developed with the aim of modifying a program such that its functionality is preserved, while either its understandability is compromised for a human reader or its decompilation fails. Some approaches rely on malicious identifier renaming, i.e., on the modification of the program identifiers in order to introduce confusion and possibly prevent the decompilation of the code. In this work we introduce a new technique to overcome the obfuscation of Java programs by identifier renaming. The technique relies on the intelligent modification of identifiers in Java bytecode. We present a new software tool which implements our technique and processes an obfuscated program in order to rename its identifiers as the technique requires. Moreover, we show how to use existing tools to provide a partial implementation of the technique we propose. Finally, we discuss the feasibility of our approach by showing how to counter the obfuscation techniques based on malicious identifier renaming recently presented in the literature.

© 2004 Elsevier Inc. All rights reserved.

Keywords: Java obfuscation; Program protection; Decompilation

1. Introduction

The diffusion of Java has deeply affected the way simple programs or complex applications are developed and distributed.
The Java platform comes with a rich set of function libraries which ease the task of software developers and make the language well suited for the development of programs in several application areas. Furthermore, Java applications are composed of dynamically loaded pieces of code which can be downloaded across the network and linked at runtime as required, providing an extensible programming environment. Another characteristic of the Java language which contributed to its wide diffusion is its portability: Java programs are compiled into a platform-neutral format, the bytecode, which can be executed by a Java Virtual Machine (JVM) running on the target platform. Since much of the information needed for the execution of bytecode is stored as symbolic references and the JVM has been designed with a very simple architecture, the bytecode format is relatively simple compared to the complexity of the machine code executed by a real microprocessor. Furthermore, extensive documentation is available for both the Java Virtual Machine and the Java language. These facts make the decompilation process very easy and constitute a threat to the interests of software developers and companies investing money in the distribution of Java-based software.

* Corresponding author. Tel.: +39 089965416; fax: +39 089965272.
E-mail addresses: cimato@dia.unisa.it (S. Cimato), ads@dia.unisa.it (A. De Santis), umbfer@dia.unisa.it (U. Ferraro Petrillo).
The Journal of Systems and Software 78 (2005) 60–72. www.elsevier.com/locate/jss
0164-1212/$ - see front matter © 2004 Elsevier Inc. All rights reserved. doi:10.1016/j.jss.2004.11.019

Decompilation is the process of translating object code to source code and is relevant to the security of a Java application for a number of reasons. Indeed, reverse engineering an executable code favors software