Estimating the Cost of Throttled Execution in Time Warp

Samir R. Das
Division of Computer Science
The University of Texas at San Antonio
San Antonio TX 78249-0667

Abstract

Over-optimistic execution has long been identified as a major performance bottleneck in Time Warp based parallel simulation systems. An appropriate throttle, or control of optimism, can improve performance by reducing the number of rollbacks. However, designing an appropriate throttle is difficult, because correct computations on the critical path may be blocked, thus increasing the overall execution time. In this paper we build a cost model for throttled execution that involves both the rollback probability and the probability that an event computation is on the critical path. The model can estimate an appropriate time window size for a throttled execution using statistics collected from the purely optimistic execution. The model is validated by an experimental study with a set of synthetic workloads.

1 Introduction

The Time Warp [11] protocol for parallel discrete event simulation (PDES) has shown considerable promise in exploiting the parallelism in a simulation model without requiring intricate model-specific information. However, it is prone to inefficient execution in many situations due to over-optimistic behavior [4], in which some logical processes (LPs) operate at a much larger simulation time than others. Over-optimism may cause long and/or cascaded rollbacks [12], and the Time Warp system may spend a considerable amount of time rolling back incorrect computation. Over-optimism may lead to other performance problems as well. For example, over-optimistic LPs may consume memory resources at an uncontrollable rate, making it impossible to complete the simulation with a finite amount of memory. Even if sufficient memory can be provided, memory management overheads may dominate [3, 4].

Thus, the conventional wisdom in the PDES community has been to study mechanisms that exhibit a “controlled” form of optimism and exploit the advantages of optimistic execution without its liabilities. In the past, several mechanisms to throttle Time Warp have been suggested. Important examples include (i) limiting all event computations to a simulated time window above the global virtual time (GVT) [14, 19], (ii) rolling back all processes to GVT (or close to GVT) at stochastically selected intervals in real time [13], (iii) not sending messages unless they are guaranteed to be correct, thereby eliminating the need for anti-messages [5, 17] (this is also known as risk-free computation), (iv) bounding the total amount of memory that can be allocated to the Time Warp system using memory management protocols like cancelback [3, 4], and (v) limiting the number of events each LP may execute beyond GVT [18]. All these mechanisms have been shown to perform better than the “purely optimistic” Time Warp for certain simulation models. In general, it is believed that an appropriate throttling of Time Warp execution (i.e., blocking one or more LPs even if they have unprocessed events in their future event list) has strong potential for improved performance. However, the throttle must be applied with caution. As observed in [15], both purely optimistic Time Warp and Time Warp with an adaptive throttle can arbitrarily outperform each other under specific circumstances. Thus it is imperative to study the appropriateness and amount of throttle required for the best possible performance.

With this goal, we develop a cost model for estimating the benefits of throttling, which can be used to design time window based throttling. The model uses monitored statistics about the rollback and event commitment behavior and computes estimates for the rollback probability and the critical path.
These estimates are used to determine an appropriate size of the time window to throttle the execution.

The rest of the paper is organized as follows. In Section 2, we provide background on the problem and relate our approach to other work in this direction. Section 3 describes the cost model. Section 4 validates the model using an experimental study. Section 5 concludes and points to future work.

2 Background and Related Work

Limiting optimism (or throttled execution) can be beneficial as it can potentially reduce rollbacks and, hence, rollback-related costs (state restoration, sending anti-messages, message cancellations at destinations). It can also reduce costs related to virtual memory management [4]. On the other hand, limited optimism can be harmful if it blocks correct computations that may affect the critical path¹ of the sim-

¹ Informally, an event is on the critical path if any delay in executing it increases the total time to complete the parallel

1087-4097/96 $5.00 © 1996 IEEE