Proceedings of the 2009 Winter Simulation Conference
M. D. Rossetti, R. R. Hill, B. Johansson, A. Dunkin, and R. G. Ingalls, eds.
SURVIVABILITY MODELING WITH STOCHASTIC REWARD NETS
Poul E. Heegaard
Department of Telematics
Norwegian University of Science and Technology (NTNU)
Trondheim, N-7491, Norway
Kishor S. Trivedi
Pratt School of Engineering
Duke University,
Durham, NC 27708, USA
ABSTRACT
Critical services in a telecommunication network should survive and be continuously provided even when undesirable events
like sabotage, natural disasters, or network failures happen. Network survivability is quantified as defined by the ANSI
T1A1.2 committee: the transient performance from the instant an undesirable event occurs until a steady state with
an acceptable performance level is attained. Performance guarantees such as minimum throughput, maximum delay or loss
should be considered.
This paper demonstrates alternative modeling approaches to quantify network survivability, including stochastic reward
nets and continuous time Markov chain models, and cross-validates these with a process-oriented simulation model. The
experience with these modeling approaches applied to networks of different sizes clearly demonstrates the trade-offs that
need to be considered with respect to flexibility in changing and extending the model, model abstraction and readability,
and scalability and complexity of the solution method.
1 INTRODUCTION
Our society is critically dependent on a wide variety of telecommunication services, and telecommunication networks and
services today are part of the national critical infrastructure that needs to be protected. Hence, evaluation of network
survivability is of utmost importance under a variety of threats, such as attacks, accidents, and failures, that may cause minor
or major service degradations. Specifically, survivability is quantified by the transient performance after an undesired event
has occurred, as specified by (ANSI T1A1.2 Working Group on Network Survivability Performance 2001).
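To make this definition concrete, the following minimal Python sketch computes the expected transient performance of a small continuous time Markov chain with state rewards, starting from the instant of the undesired event. The three-state recovery model (failed, rerouted, repaired), the rates, and the reward values are hypothetical illustrations and are not taken from this paper; uniformization is used here as one standard transient solution method, not necessarily the one applied by the authors.

```python
import math


def transient_probabilities(Q, pi0, t, tol=1e-12, max_terms=100000):
    """Transient state probabilities pi(t) of a CTMC by uniformization.

    Q is the generator matrix (rows sum to zero), pi0 the initial distribution.
    Intended for small models and moderate lam*t; exp(-lam*t) underflows
    when lam*t is very large.
    """
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n))  # uniformization rate
    if lam == 0.0 or t == 0.0:
        return list(pi0)
    # Transition matrix of the uniformized chain: P = I + Q / lam
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    weight = math.exp(-lam * t)  # Poisson probability of k = 0 jumps
    v = list(pi0)                # holds pi0 * P^k, starting with k = 0
    result = [weight * x for x in v]
    acc = weight                 # accumulated Poisson probability mass
    for k in range(1, max_terms):
        if 1.0 - acc <= tol:     # remaining Poisson tail is negligible
            break
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= lam * t / k
        result = [r + weight * x for r, x in zip(result, v)]
        acc += weight
    return result


def expected_reward(Q, pi0, rewards, t):
    """Expected instantaneous reward at time t: sum_i pi_i(t) * r_i."""
    pi_t = transient_probabilities(Q, pi0, t)
    return sum(p * r for p, r in zip(pi_t, rewards))


# Hypothetical model (assumed values, not from the paper).
# State 0: failed, state 1: rerouted (degraded), state 2: repaired.
REROUTE_RATE = 2.0  # per hour, assumed
REPAIR_RATE = 0.5   # per hour, assumed
Q = [
    [-REROUTE_RATE, REROUTE_RATE, 0.0],
    [0.0, -REPAIR_RATE, REPAIR_RATE],
    [0.0, 0.0, 0.0],
]
PI0 = [1.0, 0.0, 0.0]      # the undesired event has just occurred
REWARDS = [0.0, 0.6, 1.0]  # assumed normalized performance per state

for t in (0.0, 0.5, 1.0, 2.0, 5.0, 10.0):
    perf = expected_reward(Q, PI0, REWARDS, t)
    print(f"t = {t:5.1f} h  expected performance = {perf:.4f}")
```

The printed curve starts at the reward of the failed state and rises toward the reward of the repaired state; in this sketch, that transient trajectory is the quantity a survivability measure would be read from.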
In a multi-service telecommunication network it is essential to provide virtual connections between peering nodes ensuring
good overall utilization of the network resources while at the same time meeting differentiated and guaranteed Quality
of Service and resilience requirements. The management of such virtual connections is a challenging task since virtual
connections need to be continuously operational without unnecessary delays and with priority to highly critical services even
when undesired events occur. Many management techniques exist that apply to different network layers, use pre-planned
or reactive techniques, and utilize various setup methods with different resource utilization on local or global operational
domain and scope of repair. See (Cholda et al. 2007) for an excellent classification of recovery techniques and recent state
of the art.
A model for the evaluation of virtual connection management needs to consider both the behavioral and the
structural aspects of the system. This means that the model must capture how the performance of the virtual connection is
affected by routing and rerouting, by failures, by traffic load variations, by changes in network capacities, and by different
service requirements. Structural dependability models typically focus on the probabilities of terminal connectivity, while
behavioral models, e.g., as proposed in (Gan and Helvik 2006), take the network dynamics into account and provide steady
state service availability. Combining structural and behavioral aspects is typically done using simulation models, stochastic
Petri nets such as stochastic reward nets, or continuous time Markov chains, e.g., using Markov dependability models or
queuing network models for performance analysis, or combined performance and dependability Markov reward type models
as in (Meyer 1980, Haverkort et al. 2001, Trivedi 2001).
978-1-4244-5771-7/09/$26.00 ©2009 IEEE