Fault Tolerance in Sensor Networks: Performance Comparison of Some Gossip Algorithms

Marco Baldi, Franco Chiaraluce
Dipartimento di Ingegneria Biomedica, Elettronica e Telecomunicazioni
Università Politecnica delle Marche
Ancona, Italy
{m.baldi, f.chiaraluce}@univpm.it

Elma Zanaj
Departamenti i Elektronikes dhe Telekomunikacionit
Fakulteti i Teknologjise se Informacionit
Universiteti Politeknik i Tiranes
Tirana, Albania
ezanaj@fti.upt.al

Abstract: The goal of this paper is to evaluate the efficiency of three versions of the well-known gossip algorithm, namely basic gossip, push-sum and broadcast, for the distributed solution of averaging problems. The main focus is on the impact of link failures, which reduce the network connectivity and thereby decrease the convergence speed. As a similar effect occurs in non-fully-meshed networks, because of the limited coverage radius of the nodes, a comparison is made between these two scenarios. The considered algorithms can require optimization of some share factors; for this purpose we resort to simulations, and the conclusions are confirmed through analytical arguments that exploit the concept of potential function.

Keywords: gossip; wireless sensor networks; fault tolerance

I. INTRODUCTION

Ad hoc wireless sensor networks are peer-to-peer systems formed by many small and simple devices, able to measure some quantities and to transmit their measured values to neighboring nodes. In such networks, nodes often communicate in order to merge their individual contributions into a common result. This occurs, in particular, in averaging problems, whose goal is to calculate, in a distributed manner, the average value of a quantity of interest (e.g., temperature). Because of their features, these networks are suitable for many purposes, such as environmental monitoring applications, allowing accurate control over large areas with a favorable cost-to-benefit ratio [1].
Among these applications, however, hostile environments and scenarios of natural and man-made disasters represent great challenges, in which the network availability must be ensured in spite of a number of possible impairments.

Among the several protocols currently available for sensor communication, increasing attention has been devoted to simple decentralized procedures based on the gossip principle, through which the computational burden is distributed among all nodes. The gossip algorithm was originally conceived for telephone networks [2], [3]. When gossip is applied in sensor networks, denoting by $x_i$ and $x_j$ the local measurements of the $i$-th and $j$-th nodes, an interaction between them updates one or both of their values, which are then used in a subsequent interaction.

The communication protocols can be managed either synchronously or asynchronously, but the latter is more practical because of its inherent simplicity. In this paper, we therefore consider only an asynchronous time model, in which each node has a clock that ticks independently at the times of a rate-1 Poisson process. The inter-tick times at any node are thus rate-1 exponential random variables, independent across nodes and over time.

Various implementations of gossip for averaging problems are possible; they aim at estimating the mean value of the sensed quantity. More precisely, let us denote by $N$ the number of nodes and by $\mathbf{x}(k) = [x_1(k), x_2(k), \ldots, x_N(k)]^T$ the vector of the estimates of all nodes after $k$ clock ticks. The target of the algorithm is to find a reliable estimate of the average value $x_{\mathrm{ave}} = \frac{1}{N} \sum_{i=1}^{N} x_i(0)$ in the shortest possible time, that is, maximizing the convergence speed.

In a first implementation, called “basic gossip” in the following, an interaction between the $i$-th and $j$-th nodes produces as output $x_i(k+1) = x_j(k+1) = x_i(k)/2 + x_j(k)/2$, which is used by both nodes in the subsequent interaction [4].
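The basic gossip update can be illustrated with a minimal simulation sketch. It is not the simulator used in this paper: the function name `basic_gossip`, the fully-meshed topology, the uniform choice of the partner node and the initial values are all assumptions made for illustration. The independent rate-1 Poisson clocks are represented only through their order of ticking, since each tick belongs to a node chosen uniformly at random.

```python
import random


def basic_gossip(x0, ticks, seed=0):
    """Run `ticks` pairwise-averaging interactions (hypothetical helper).

    Assumes a fully-meshed network: any node can interact with any other.
    """
    rng = random.Random(seed)
    x = list(x0)
    n = len(x)
    for _ in range(ticks):
        i = rng.randrange(n)       # node whose clock ticks
        j = rng.randrange(n - 1)   # partner drawn uniformly among the others
        if j >= i:
            j += 1                 # skip i itself, so that j != i
        # Basic gossip update: both nodes adopt the pairwise average.
        x[i] = x[j] = (x[i] + x[j]) / 2
    return x


# Illustrative initial measurements (assumed values, N = 10).
x0 = [float(v) for v in range(10)]
x_ave = sum(x0) / len(x0)

x = basic_gossip(x0, ticks=2000)
max_error = max(abs(v - x_ave) for v in x)
print("largest deviation from the average:", max_error)
```

Each interaction leaves the sum of the estimates unchanged, so the only value on which all nodes can agree is the initial average; the sketch checks how far the estimates are from it after a given number of ticks.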
A variant of this proposal is the so-called “push-sum” algorithm [5]. According to this protocol, a node forwards a properly defined share of its values to one of its neighbors, selected at random, while keeping the remaining part. The performance of the push-sum algorithm depends on the choice of the share, which therefore represents a degree of freedom to optimize.

Both the basic gossip and the push-sum algorithm are point-to-point protocols. However, in a wireless network, when a node transmits, all nodes in its coverage area can receive the transmitted data. This suggests implementing a broadcast algorithm to reduce the averaging time.

Although the fundamentals of the considered protocols are well known and a number of papers on these topics have already appeared in the literature, several issues are still open. Among them is the problem, mentioned above, of optimizing the share values in the push-sum algorithm. In [5] the authors state only that the choice of shares may be deterministic or randomized, and may or may not depend on time, without providing, however, numerical evidence of