Partially Synchronous Overlays: Issues and Challenges

Jawwad Shamsi, Chunbo Chu and Monica Brockmeyer
Wayne State University
{jshamsi, chunbo, mbrockmeyer}@wayne.edu

Abstract

The main advantage of synchronous systems over asynchronous systems is that timing and reliability guarantees can be established, ensuring a strong programming model for many applications. These strong guarantees are not provided by the current model of the Internet. However, the Internet possesses some synchronous properties which can be exploited to observe partial synchrony, leading to a stronger programming model. In this paper, we argue the importance of synchrony and describe mechanisms which can be used for the construction of partially synchronous overlays on the Internet.

Keywords: Partial Synchrony, Internet Systems, Overlay Networks, Network Monitoring.

1. Introduction

The Internet is generally modeled as an asynchronous, best-effort communication system in which guarantees of reliable communication cannot be established. Although this uncertain behavior has played an integral part in the scalable, wide-scale deployment of the Internet, it also imposes many restrictions on distributed applications. Several distributed applications require nodes to have access to information about time, which can be used to synchronize their activities or order their events. Distributed algorithms such as consensus [14] [25] and leader election in a group [7] require the guarantees of a synchronous system for their proper solution. Further, a synchronous system is likely to provide more precise knowledge of system states and more efficient evaluation of predicates for global monitoring. Synchrony is also beneficial in applications where message delivery and reliability are essential, including e-commerce transactions and session management techniques such as Kerberos [20].
Many solutions under the synchronous model exhibit better time complexity than those constructed for asynchronous systems [2]. Additionally, applications deployed on synchronous systems can provide improved timing assumptions and failure semantics.

This material is based upon work supported by the National Science Foundation under CAREER grant ANI-0347222.

The synchronous model requires that communication latency, clock drift and processor speeds be bounded for all the nodes of the system [14]. The performance of a distributed system can be improved significantly by providing these three guarantees. For instance, enforcing an upper bound on communication latency can provide the better timeout and failure detection characteristics required by many applications. Similarly, synchronizing clocks can provide a suitable mechanism for message ordering or synchronizing activities.

Due to the dynamic network conditions and the scale of the Internet, these guarantees cannot be established in the current system model of the Internet. Network traffic in the Internet is affected by different factors such as transient and persistent congestion, route changes and queuing delays. Therefore, it is impossible to establish bounds on clock drift and communication delay for every host or every communication channel in the Internet.

The desire for a synchronous communication model for the Internet is intensified by its increasing and diversified use. What is needed is an efficient and scalable model which can promote synchrony among member nodes and can also provide the guarantees of a synchronous system for distributed applications. Our thesis is that it is possible to construct overlay networks on the Internet which exhibit some synchronous behavior. While the precise synchronous behavior will vary, these overlays will have communication limits which are known but do not hold indefinitely, so that periods of synchrony are separated by periods of asynchrony.
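To make the notion of a latency bound that holds only for limited periods concrete, the following is a minimal sketch and not part of the proposed framework. It assumes a node probes a peer periodically and records a round-trip time for each probe, or None when the probe is lost. The function name synchrony_periods, the sample format, and the 100 ms bound are hypothetical choices made for illustration only.

```python
from typing import List, Optional, Tuple

# Hypothetical sample format: (send time in seconds, RTT in seconds or None if lost).
Sample = Tuple[float, Optional[float]]


def synchrony_periods(samples: List[Sample], bound: float) -> List[Tuple[float, float]]:
    """Return maximal (start, end) intervals in which every probe was
    answered within `bound` seconds. Probes that are lost or late end
    the current interval and count as a period of asynchrony."""
    periods: List[Tuple[float, float]] = []
    start: Optional[float] = None
    last: Optional[float] = None
    for t, rtt in samples:
        within_bound = rtt is not None and rtt <= bound
        if within_bound:
            if start is None:
                start = t          # a new period of synchrony begins
            last = t
        elif start is not None:
            periods.append((start, last))  # the bound was violated; close the period
            start = None
    if start is not None:
        periods.append((start, last))      # close a period still open at the end
    return periods


# Illustrative (made-up) probe results, one probe per second.
samples = [(0.0, 0.040), (1.0, 0.055), (2.0, 0.310),
           (3.0, None), (4.0, 0.048), (5.0, 0.051)]
print(synchrony_periods(samples, bound=0.100))
```

With these made-up samples, the late probe at t = 2.0 and the lost probe at t = 3.0 split the trace into two periods of synchrony separated by one period of asynchrony.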
The overall goal of this project is to explore the synchronous characteristics of the Internet and to develop a framework upon which partially synchronous overlays can be constructed. We have previously developed a scheme for Tactical Construction of Overlays [24]. We expect that such a system can be beneficial for the construction of partially synchronous overlays. In this paper, we present a model for the construction of partially synchronous overlays. We specifically address establishing an upper bound on communication latency,