Dynamic Software Updates for Accelerating Scientific Discovery

Dong Kwan Kim, Myoungkyu Song, Eli Tilevich, and Shawn A. Bohner

Center for High-End Computing Systems (CHECS)
Dept. of Computer Science, Virginia Tech
Blacksburg, VA 24061, USA
{ikek70,mksong,tilevich,sbohner}@cs.vt.edu

Abstract. Distributed parallel applications often run for hours or even days before arriving at a result. In the case of such long-running programs, the initial requirements could change after the program has started executing. To shorten the time it takes to arrive at a result when running a distributed computationally-intensive application, this paper proposes leveraging the power and flexibility of dynamic software updates. In particular, to enable flexible dynamic software updates, we introduce a novel binary rewriting approach that is more efficient than the existing techniques. While ensuring greater flexibility in enhancing a running program for new requirements, our binary rewriting technique incurs only negligible performance overhead. We validate our approach via a case study of dynamically changing a parallel scientific simulation.

Key words: Dynamic Software Updates, Time-to-Discovery, Computationally-Intensive Applications, JVM HotSwap, Bytecode Enhancement

1 Introduction

Scientific computing is an interdisciplinary research area that uses computer technologies to analyze mathematical models for computationally demanding problems, including forecasting the weather, predicting earthquakes, and simulating molecular dynamics. Despite ever-increasing computing power, scientific computing applications are often long-running, taking hours or even days to arrive at a result, due to the tremendous amount of computation involved. An effective approach to reducing the computing time in scientific programs is parallel processing, particularly using compute clusters and computational grids.
In a long-running application, the initial scientific requirements could change while the execution is in progress. To realize the changed requirements, the standard approach is to stop the running application, change the code, and restart the application. However, this maintenance approach does not use the computing resources effectively, as it repeats some of the computation that was already performed. This work is concerned with perfective maintenance required to address changes in requirements rather than corrective maintenance required to address