Safe Upgrading without Restarting

Miles Barr and Susan Eisenbach
Department of Computing
Imperial College London, Great Britain, SW7 2BZ
email: sue@doc.ic.ac.uk, miles@milesbarr.com

Abstract

The distributed development and maintenance paradigm for component delivery is fraught with problems. One wants a relationship between developers and clients that is autonomous and anonymous. Yet components written in languages such as C++ require the recompilation of all dependent subsystems when a new version of a component is released. The design of Java’s binary format has side-stepped this constraint, removing the need for total recompilation with each change. But the potential is not fulfilled if programs have to be stopped to swap in each new component.

This paper describes a framework that allows Java programs to be dynamically upgraded. Its key purpose is to allow libraries that are safe to replace existing libraries without adversely affecting running programs. The framework provides developers with a mechanism to release their libraries and provides clients with the surety of only upgrading when it is safe to do so.

1 Introduction

The advent of the Internet has added its own problems for software maintenance.

These problems cannot be prevented or solved directly because they arise out of inconsistencies between dynamically loaded packages that are not under the control of a single system administrator and so cannot be addressed by current configuration management techniques [23].

Software has moved away from the stand alone applications of pre-Internet days. Client/server applications are the norm and the stand alone applications that exist today are expected to undergo regular upgrades. Software development itself is done in distributed environments. With the universal availability of the Internet we can now alter end-user software, ensuring that each client is using the latest version.
Developers can work across the globe on the same project, or on different projects sharing the same resources.

Unfortunately, the upgrading process may not go smoothly. Upgrades can be incompatible, or may introduce subtle errors that don’t appear until later in the program’s life. There is scope for a more rigorous upgrading process.

Even if the upgrade is fine, the upgrade process itself frequently causes problems. Usually when a component is replaced the program (typically a server) needs to be stopped and restarted. The maintainer downloads the new version, tests it, halts the server and installs it. To fully automate the process, especially if continuous execution is required, dynamic upgrading is necessary.

Problems arise when users are developers and the new component is a library. The Java Language Specification [11] defines changes as binary compatible if they can be made to programs without introducing link-time errors. Making sure the component is binary compatible is only one part of the problem. Developers could be located anywhere in the world, almost certainly out of the control of the component author. We need a framework that will allow developers to automagically upgrade to the latest version. If the new component is binary compatible, it replaces the old one and the developer needn’t do any work. We provide a framework to allow the safe dynamic upgrading of software across the Internet.

The problems of distributed software maintenance are encountered frequently. For example, a toolkit supplier may release software when it is barely out of alpha testing due to time constraints. So users of the libraries will get frequent upgrades, often resulting in systems breaking. Inevitably an upgrade to an incompatible version will be needed, but considerable time will be lost simply finding out if a new version will cause problems.
With our framework, upgrades to incompatible versions could be scheduled so that when the site goes live it could automatically switch over to newer versions of the toolkit to enhance performance.

If the demands of distributed software maintenance were not enough, designers of new software regularly test the