Appeared in the Proceedings of the First Annual IEEE/ACM International Symposium on Code Generation and Optimization, 27-29 March 2003, San Francisco, California. © 2003 IEEE

The Transmeta Code Morphing Software: Using Speculation, Recovery, and Adaptive Retranslation to Address Real-Life Challenges

James C. Dehnert, Brian K. Grant, John P. Banning, Richard Johnson, Thomas Kistler, Alexander Klaiber, Jim Mattson

Transmeta Corporation, 3990 Freedom Circle, Santa Clara, CA 95054

The authors warmly acknowledge the numerous Transmeta engineers who designed and implemented the Crusoe Code Morphing Software and processor. This paper is based on their excellent work. Email contacts: dehnert@transmeta.com, grant@transmeta.com, and rjohnson@transmeta.com.

Abstract

Transmeta’s Crusoe microprocessor is a full, system-level implementation of the x86 architecture, comprising a native VLIW microprocessor with a software layer, the Code Morphing Software (CMS), that combines an interpreter, dynamic binary translator, optimizer, and runtime system. In its general structure, CMS resembles other binary translation systems described in the literature, but it is unique in several respects. The wide range of PC workloads that CMS must handle gracefully in real-life operation, plus the need for full system-level x86 compatibility, expose several issues that have received little or no attention in previous literature, such as exceptions and interrupts, I/O, DMA, and self-modifying code. In this paper we discuss some of the challenges raised by these issues, and present the techniques developed in Crusoe and CMS to meet those challenges. The key to these solutions is the Crusoe paradigm of aggressive speculation, recovery to a consistent x86 state using unique hardware commit-and-rollback support, and adaptive retranslation when exceptions occur too often to be handled efficiently by interpretation.

1 Introduction

Transmeta’s Crusoe VLIW processor and CMS [20] present an approach unique among commercial architectures: a microprocessor system with an internal VLIW instruction set architecture (ISA) with little resemblance to the external ISA (x86) that it presents to users. This approach allows a simple, compact, low-power microprocessor implementation, with the freedom to modify the internal ISA between generations, while supporting the broad range of legacy x86 software available.

Producing robust runtime performance comparable to competing x86 implementations requires that CMS deal effectively with a number of difficult problems that have usually been ignored in the literature on binary translation and dynamic optimization. In this paper, we will sketch the structure of CMS, but our focus will be on several of the challenges it faced that set it apart from other systems described in the literature, and on the solutions we implemented. These challenges are natural consequences of CMS objectives:

- CMS must faithfully implement the complete x86 architecture: all instructions (including memory-mapped I/O), architectural registers, and complete exception behavior.

- CMS can make no assumptions about the operating system running on the processor and cannot depend on information or other assistance from the system. It is a system-level implementation, not application-level, and even executes the BIOS code. One consequence is that it does not have access to the executable files of the applications it runs; all translation is done on-line as the target software executes.

- CMS must provide robust performance for a wide variety of systems and applications, ranging from games and media processing to desktop productivity and server applications. This requires dealing with unpleasant realities like self-modifying code and precise exceptions.
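The paradigm named in the abstract (speculate aggressively, roll back to a consistent state on a fault, and retranslate adaptively when faults recur) can be illustrated with a toy model. The Python sketch below is not CMS code and makes no claim about how Crusoe implements these steps. All names (`ShadowedState`, `Region`, `run_region`, `FAULT_THRESHOLD`) are hypothetical, the register file is a plain dictionary, and the conservative path stands in for both interpretation and a more conservative retranslation.

```python
from dataclasses import dataclass, field
from typing import Callable

FAULT_THRESHOLD = 2  # hypothetical: faults tolerated before giving up on speculation


class SpeculationFault(Exception):
    """Raised when a speculative assumption turns out to be wrong."""


@dataclass
class ShadowedState:
    """Toy register file: a committed copy plus a working copy."""
    committed: dict = field(default_factory=dict)
    working: dict = field(default_factory=dict)

    def commit(self) -> None:
        # Make the working values the new consistent state.
        self.committed = dict(self.working)

    def rollback(self) -> None:
        # Discard all uncommitted updates.
        self.working = dict(self.committed)


@dataclass
class Region:
    """A code region with an aggressive and a conservative version."""
    aggressive: Callable[[dict], None]
    conservative: Callable[[dict], None]
    faults: int = 0
    use_conservative: bool = False


def run_region(region: Region, state: ShadowedState) -> str:
    """Execute one region and report which version completed it."""
    if not region.use_conservative:
        try:
            region.aggressive(state.working)
            state.commit()
            return "aggressive"
        except SpeculationFault:
            state.rollback()               # recover the last consistent state
            region.faults += 1
            if region.faults > FAULT_THRESHOLD:
                region.use_conservative = True  # adapt: stop speculating here
    # Re-execute from the consistent state without speculation.
    region.conservative(state.working)
    state.commit()
    return "conservative"


def aggressive_add(regs: dict) -> None:
    regs["eax"] = regs.get("eax", 0) + 1   # speculative update, made before the check
    if regs.get("trap"):
        raise SpeculationFault()


def conservative_add(regs: dict) -> None:
    regs["eax"] = regs.get("eax", 0) + 1
```

In this model, a region whose speculation keeps failing pays for a rollback on each of its first few executions, after which `use_conservative` is set and the aggressive version is no longer attempted. The committed copy never observes the partial update made by `aggressive_add` before it faults, which is the property that makes aggressive speculation safe in the sketch.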
It is important to note in this regard that CMS is not a migration tool – unlike past commercial systems, CMS is not an interim solution to be used during transition of the code base to a new architecture, and cannot deal with unusual but