Manycast and Anycast Routing for Replica Placement in Datacenter Networks

Ajmal Muhammad (1), Nina Skorin-Kapov (2), Lena Wosinska (3)

(1) Department of Electrical Engineering, Linköping University, Linköping, Sweden, ajmal@isy.liu.se
(2) University Centre of Defence at the Spanish Air Force Academy, Santiago de la Ribera, Spain
(3) School of Information and Communication Technology, KTH Royal Institute of Technology, Stockholm, Sweden

Abstract

Inter-datacenter networks need to support datacenter communication with end users, as well as content replication and synchronization between datacenters. This paper presents an integrated routing and replica placement strategy for WDM inter-datacenter networks that reduces overall network resource consumption.

Introduction

As demand for cloud services proliferates, cloud Content Service Providers (CSPs) such as Amazon, Google, Facebook, and Yahoo increasingly create, store, and share massive amounts of content. Generally, content is replicated across multiple geographically dispersed datacenters (DCs) interconnected via ultra-high-capacity wavelength division multiplexing (WDM)-based inter-DC networks, either owned or leased by the CSP [1]. Such a distribution offers high service availability, even if failures occur in the network, and enables end users to connect to the most convenient (e.g., the closest) DC, thereby decreasing transit latency and/or the amount of network resources needed to support the service requests. In the context of network planning, cloud services give rise to new traffic patterns and huge traffic volumes that require tailored, capacity-efficient solutions.

The traffic associated with inter-DC networks can be classified into two broad categories. The first encompasses inter-DC traffic related to content replication and synchronization/updating between DCs [2].
Routing paradigms associated with this type of traffic usually include unicast communication between DCs or multicast communication between a set of DCs where content replicas are stored [3]. The second type of traffic comprises end-user-driven communication, where users access cloud content and services, typically by applying the anycast routing paradigm [1]. In anycast routing, the user is served from any one of multiple DCs that support the specified content/service. In other words, if replicas of a certain content are stored at multiple DCs, the user can connect to any one of them. This modifies the assumptions used by classic routing strategies because the traffic matrix is not fully specified in advance: the destination node of a demand is not fixed but can be any node within a given subset. Both of the aforementioned traffic categories imply high bandwidth requirements, and their provisioning should be optimized to achieve resource-efficient network planning.

The bandwidth requirements for inter-DC content replication and synchronization increase with the number of replicas and the network distance between the hosting DCs. On the other hand, a large number of geographically distributed replicas reduces the bandwidth requirements (and transit latency) of end-user-driven traffic, since the user can typically access a closer DC. Most previous works develop planning strategies by considering only one of these traffic types. For example, the optimal placement of applications and their replicas over the available cloud infrastructure to minimize the data update cost was investigated in [3]. Content placement with the goal of reducing the overall bandwidth requirements for a given set of user demands was studied in [4]. A disaster-resilient approach to content placement employing survivable anycast routing to optimize only user-driven traffic was proposed in [1].
However, to reduce the overall resource consumption more effectively, enhanced network planning strategies could be developed by jointly considering both traffic types.

Motivated by this observation, in this paper we solve the Routing, Update and Replica Placement (RURP) problem in WDM-based inter-DC networks. Given an inter-DC network topology and the locations of the DCs, a set of contents to be replicated, the number of required replicas for each content, and a set of user demands for specific content, we determine a replica placement, as well as the associated routing and wavelength assignment solutions for both considered traffic categories, i.e., for user demands and the associated updating/synchronization traffic, to minimize the total capacity requirements. Each user demand is assumed to require a lightpath following the anycast routing paradigm, i.e., users can be connected to any DC hosting a replica of the desired content. Given a replica placement, the synchronization/update traffic is assumed to utilize the multicast routing paradigm, where all DCs hosting replicas are interconnected by a light-tree rooted at a main datacenter. However, to establish the light-trees and perform replica placement simultaneously, we apply the manycast routing scheme [5], which requires the root node to be interconnected to only a subset of the destination set (i.e., the set of DCs). The end nodes of the obtained manycast tree then determine the replica placement. We formulate the RURP problem as an Integer Linear Program (ILP) that solves the anycast (end-user) and manycast (update traffic) problems simultaneously to find an optimal replica placement and the associated lightpaths and light-trees, which minimize the total capacity usage.
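To make the trade-off between update (manycast) and user (anycast) traffic concrete, the following is a minimal sketch on a toy instance; it is not the ILP formulation of this paper. It rests on several illustrative assumptions: a hypothetical 6-node ring topology, capacity counted as the number of links used, no wavelength assignment, users served only from replica DCs (not from the root), and the update tree approximated by the union of shortest paths from the root, which is a heuristic and may use more links than a minimum tree. All names (`solve_toy_rurp`, `ADJ`, `DCS`, `ROOT`, `USERS`) are made up for this example.

```python
from collections import deque
from itertools import combinations


def bfs_parents(adj, src):
    """Shortest-path (hop count) parent map from src in an undirected graph."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def path_links(parent, dst):
    """Links (as sorted node pairs) on the path from the BFS source to dst."""
    links = []
    while parent[dst] is not None:
        links.append(tuple(sorted((dst, parent[dst]))))
        dst = parent[dst]
    return links


def solve_toy_rurp(adj, dcs, root, k, users):
    """Enumerate all k-replica placements and return the cheapest one as
    (total_links, replicas, tree_links, anycast_links)."""
    parents = {n: bfs_parents(adj, n) for n in adj}

    def hops(a, b):
        return len(path_links(parents[a], b))

    best = None
    for replicas in combinations([d for d in dcs if d != root], k):
        # Manycast side (heuristic): union of shortest paths root -> replica DC.
        tree = set()
        for d in replicas:
            tree.update(path_links(parents[root], d))
        # Anycast side: each user connects to its nearest replica DC.
        anycast = sum(min(hops(u, d) for d in replicas) for u in users)
        total = len(tree) + anycast
        if best is None or total < best[0]:
            best = (total, replicas, len(tree), anycast)
    return best


# Hypothetical toy instance: ring 0-1-2-3-4-5-0, main DC at node 0.
ADJ = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 0]}
DCS = [0, 1, 3, 4]   # candidate DC locations (node 0 is the root)
ROOT = 0
USERS = [2, 4, 5]    # source nodes of user demands for one content

result = solve_toy_rurp(ADJ, DCS, ROOT, k=2, users=USERS)
print(result)
```

Exhaustive enumeration is only feasible for very small instances; it is used here to show how the replica set chosen by the tree side directly changes the anycast cost, which is the coupling the joint formulation exploits.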