Reprint: 12th European Simulation Multiconference, June 16-19, 1998 (Manchester, United Kingdom).

ANALYSIS OF REAL-TIME CONCURRENT SYSTEMS MODELS BASED ON CSP USING STOCHASTIC PETRI NETS

Frederick T. Sheldon
Department of Computer Science
The University of Colorado at Colorado Springs
1440 Austin Bluffs Parkway
Colorado Springs, Colorado 80918, USA
E-mail: Sheldon@cs.uccs.edu

KEYWORDS
Software engineering, Computer systems, Model analysis, Real-time, Performance and reliability analysis.

ABSTRACT
This paper addresses the real-time and reliability analysis of models for concurrent systems. Such models define independent entities that cooperate by explicit communication. Communications represent visible actions which, if they do not occur or are delayed beyond their deadline, cause a failure. This approach converts a formal functional system description into the information needed to predict its behavior as a function of observable parameters (i.e., topology, fault-tolerance, deadlines, communications and failure categories). The CSP-based models are translated into Stochastic Petri nets (SPNs) using our tool CSPN (CSP-to-Stochastic Petri Nets).¹ CSPN uses algorithms which codify the canonical translations between essential CSP constructs and SPNs. The term "CSP-based" is used to distinguish between the exact notation of Hoare's original CSP and our textual representations, which are similar to occam2. The CSP-based grammar is sufficient to preserve the structural properties of the original model. Consideration of other CSP properties (e.g., traces, refusal sets, livelock, etc.) is not precluded; however, they are not considered here. A basic example is provided to illustrate the link between failure behavior and model characteristics (i.e., derivation of timing failure probability and reliability predictions as they would affect the cost for implementing a candidate model specification).

¹ CSP stands for Communicating Sequential Processes [Hoare 85] and the language is called PCSP for Parsable-CSP.

1. INTRODUCTION
The sources of errors in complex systems include a wide range of possible failure causalities (e.g., untested manufactured flaws, software design and implementation defects including timing errors, etc.). The most prevalent types are highly dependent on the system, its operating environment, workload and system design, including the integration and testing process. Furthermore, in critical systems, timing and performance issues must be considered. For example, embedded real-time systems (e.g., characterized by intense interaction with sensors and actuators) can control continuous reversible processes that typically possess the ability to tolerate brief periods of incorrect interaction, either in the values exchanged or in the timing of exchanges [Shin and Kim 94]. Considering such factors during the specification, analysis and design of such systems is a difficult undertaking.

Moreover, the systems designed and built today have greater functionality and higher performance (e.g., confidence gained from many operational hours, legacy systems that are evolved and have been refined). Whether these systems are more robust and more reliable is a less obvious question. Assuming they are more reliable, the question then becomes "...at what price?" The challenge is to develop effective methods and realistic models for reasoning about and evaluating such systems prior to building costly prototypes. As Figure 1 illustrates, formal, mathematically precise methods should be used to design such systems [Ostroff 92]. Given a formal model of a system and its external constraints (e.g., topology, communication, deadlines), what mechanisms are available for avoiding errors, and how do they impact the behavioral aspects (i.e., performance and reliability) of the system [Kavi et al. 95]?
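
As a rough illustration of how a deadline constraint can turn into a reliability figure, the following Python sketch assumes that each communication delay is exponentially distributed (the usual timing assumption for SPN transitions) and that communications are independent of one another. The function names, rates and deadlines are hypothetical; they are not taken from the CSPN tool or from the example analyzed in this paper.

```python
import math


def deadline_miss_probability(rate: float, deadline: float) -> float:
    """Return P(delay > deadline) for an exponentially distributed delay.

    Assumption: the delay has rate `rate` (mean 1/rate), so the
    probability of exceeding the deadline is exp(-rate * deadline).
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if deadline < 0:
        raise ValueError("deadline must be non-negative")
    return math.exp(-rate * deadline)


def series_reliability(miss_probabilities) -> float:
    """Return the probability that every communication meets its deadline.

    Assumption: the communications are independent, so the individual
    success probabilities (1 - p) multiply.
    """
    reliability = 1.0
    for p in miss_probabilities:
        reliability *= 1.0 - p
    return reliability


if __name__ == "__main__":
    # Hypothetical communications with made-up rates and deadlines.
    p_first = deadline_miss_probability(rate=2.0, deadline=1.5)
    p_second = deadline_miss_probability(rate=4.0, deadline=1.0)
    print("timing failure probabilities:", p_first, p_second)
    print("reliability:", series_reliability([p_first, p_second]))
```

Tightening a deadline or lowering a rate raises the miss probability, which is the kind of trade-off between external constraints and reliability that the question above is concerned with. A full SPN analysis would replace the independence assumption with the state space of the translated net.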
As models are refined, the reliability and performance requirements can also be refined to reveal the trade-offs among design alternatives, such as deciding which system elements are critical, determining which features of the system should be changed to improve its reliability, or validating performance and reliability goals using stochastic models. A system is modeled using our CSP-based language. This grammar does not restrict the consideration of correctness properties; it requires only that the structural and functional properties be preserved. Once the model has been translated, it may be solved using Markov reward analysis [Ciardo et al. 91, 89; Chiola 89; Marsan et al. 84, 89; Tomek et al. 94; Bolch et al. 98].

[Figure 1. Predicting reliability of model specifications. The figure pairs a formal functional system specification (a TrainXing model with Train and Gate processes, truncated in this reprint) with the requirements, i.e., the external constraints on the system (topology, fault tolerance, deadlines, communications, failure categories, resource allocation), and asks how the externals impact performance and reliability.]

This paper introduces the formalisms and gives a simple example to illustrate the approach: (1) conversion of the CSP model into SPNs, (2) identifying system failure modes by