A Threshold Based On-line RWA Algorithm with End-to-end Reliability Guarantees

Zsolt Pándi
Department of Telecommunications
Budapest University of Technology and Economics
Budapest, Magyar tudósok krt. 2., 1117, Hungary
Email: pandi@hit.bme.hu

Marco Tacca and Andrea Fumagalli
OpNeAR Research Lab
The University of Texas at Dallas
2601 North Floyd Rd, Richardson, TX, 75080, USA
Email: {mtacca,andreaf}@utdallas.edu

Abstract— When reliability is a major concern in optical WDM networks, Shared Path Protection (SPP) schemes offer the potentially appealing feature of requiring fewer network resources than their counterpart Dedicated Path Protection (DPP) schemes. However, while sharing increases resource utilization, it reduces the end-to-end reliability of connection demands. Moreover, the evaluation of connection demand reliability is easy with DPP: it depends only on the components used to route the connection demand. It is more complex with SPP, as sharing of protection resources introduces dependencies among different connection demands, and the reliability of a connection also depends on the failure of components that are not used to route the connection demand. The possible occurrence of multiple failures exacerbates this problem significantly. For this reason, the simpler DPP schemes are often chosen, even though, as a result, network resources are not efficiently used. This paper presents a simple way to determine routing and wavelength assignment for connection demands subject to reliability constraints using SPP schemes. The solution is based on controlling the amount of sharing that can be done on protection resources. Sharing is controlled by a threshold, assigned to each spare resource, which allows one to quickly determine whether or not (additional) sharing is permitted, i.e., whether the required reliability levels of both the newly arrived demand and the already established demands are all guaranteed.
When adopting the proposed solution, the challenge is to select a threshold value that yields good utilization of the network resources. When such a value can be found, the proposed solution makes it possible to achieve network utilization values that are attainable only by means of SPP schemes, while requiring a computational technique whose simplicity is comparable to that of DPP schemes.

I. INTRODUCTION

Quality of Service (QoS) awareness gained vital importance in service provisioning with the roll-out of applications that impose quality requirements on data transfer. Fulfilling these requirements also necessitates that the underlying networking technology be capable of offering end-to-end transport service at different reliability levels. Reliability can be improved by providing protection, that is, provisioning connections that require high reliability with additional standby resources that can be readily activated and used in case of a failure of one or more components. Protection schemes can utilize dedicated resources, e.g., Dedicated Path Protection (DPP) [1], or shared resources, e.g., Shared Path Protection (SPP) [1]. There is a clear tradeoff between dedicated protection schemes and shared protection schemes. As a general rule of thumb, dedicated protection requires more resources, but in turn provides higher reliability. On the other hand, shared protection is more resource efficient but provides lower reliability [2].

Both dedicated and shared protection schemes have been an active area of research. [3] discusses the problem of double-failure resilient design for p-cycles and presents an approximation technique that improves survivability with respect to double failures. However, possibly different end-to-end reliability requirements are not taken into account during the computations. [4] argues that in some cases single-failure protection designs provide adequate protection even against double failures.
Moreover, the paper also shows that double-failure resilience can be improved by limiting the extent of resource sharing. An intuitive explanation for the first claim is that a connection is disrupted by only a small fraction of all possible failures of higher multiplicity. The second claim can be explained as follows: limiting sharing decreases the interference among different connections, so they do not block each other when trying to survive failures. The observations in [4] are based on an analysis of designs optimized for single-failure and double-failure robustness; however, quantitative reliability guarantees are not addressed on a per-connection basis.

With the exception of [5], the above papers address reliability issues from a failure-state-based requirement point of view, for example, providing protection against any single/double link failure. [5] introduces a technique that is in principle capable of providing a design that is robust in the presence of failures of any multiplicity and also addresses probabilistic demand survivability requirements, in other words, satisfies each demand's Maximum Downtime Ratio (MDR). The drawback is that the complexity of the necessary computations increases with the failure multiplicity.
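The threshold-controlled sharing idea outlined in the abstract can be sketched in a few lines. The following is a hypothetical illustration, not the paper's exact algorithm: each spare (protection) resource carries a sharing threshold, and a newly arrived demand may use a candidate backup path only while every spare resource along it remains below its threshold, so that (additional) sharing can be permitted or rejected with a simple counter comparison.

```python
# Hypothetical sketch of a threshold-based sharing check (names and data
# structures are assumptions, not from the paper): each spare resource
# admits at most `threshold` sharers.

class SpareResource:
    def __init__(self, threshold):
        self.threshold = threshold   # max number of demands allowed to share
        self.sharers = set()         # demands currently sharing this resource

    def can_share(self):
        # Additional sharing is permitted only below the threshold.
        return len(self.sharers) < self.threshold


def admit_backup_path(demand_id, spare_links):
    """Admit a demand on a candidate backup path iff every spare link
    along the path still permits additional sharing."""
    if all(link.can_share() for link in spare_links):
        for link in spare_links:
            link.sharers.add(demand_id)
        return True
    return False


# Example: two spare links with threshold 2 admit two demands, reject a third.
path = [SpareResource(threshold=2), SpareResource(threshold=2)]
print(admit_backup_path("d1", path))  # True
print(admit_backup_path("d2", path))  # True
print(admit_backup_path("d3", path))  # False
```

Because the check reduces to comparing a counter against a threshold, its per-demand cost is comparable to the simple computations used with DPP, which is the appeal claimed in the abstract.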
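As background for the per-connection reliability evaluations discussed above, the following sketch shows why the DPP case is easy. It assumes a simple model of statistically independent component failures (an illustrative assumption, not the formulation of [5]): a DPP connection's reliability depends only on its own disjoint working and protection paths, and a probabilistic requirement such as a Maximum Downtime Ratio (MDR) can then be checked directly against the resulting value.

```python
# Illustrative reliability evaluation for DPP under assumed independent
# component failures; the model and function names are this sketch's own.
from math import prod

def path_reliability(component_reliabilities):
    # A path is up only if every component along it is up.
    return prod(component_reliabilities)

def dpp_reliability(working, protection):
    # With disjoint paths, the connection is down only if both paths fail.
    r_w = path_reliability(working)
    r_p = path_reliability(protection)
    return 1.0 - (1.0 - r_w) * (1.0 - r_p)

def meets_mdr(availability, mdr):
    # The expected downtime fraction must not exceed the demand's MDR.
    return (1.0 - availability) <= mdr

# Example: 3-hop working and 4-hop protection path, each link 99.9% reliable.
a = dpp_reliability([0.999] * 3, [0.999] * 4)
print(round(a, 6), meets_mdr(a, mdr=1e-4))  # 0.999988 True
```

With SPP, no such closed-form product over the connection's own components exists, because shared spare resources couple a connection's fate to failures on paths it does not traverse; that coupling is what the threshold mechanism sidesteps.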