IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 28, NO. 8, OCTOBER 2010, p. 1287

Rate Limiting in an Event-Driven BGP Speaker

Euan D. Harris and Timothy G. Griffin

Abstract—Implementing BGP route processing in an event-driven manner appears to be advantageous in terms of scalability. However, the inter-domain routing system as a whole would be overwhelmed without some type of rate limiting on BGP update streams. At first glance, an event-driven, pipelined route processing model does not seem to fit well with the traditional timer-based way of implementing BGP rate limiting. In this paper we present a lazy event-driven BGP route processing pipeline that easily accommodates rate limiting.

Index Terms—BGP, Rate limiting, Event-driven architecture

I. SPEEDING UP WHILE SLOWING DOWN?

The Border Gateway Protocol (BGP) [1] implements the Internet's interdomain routing system. Each BGP-speaking router, or speaker, continually exchanges information with its neighbors about changes to the destinations to which it can send packets and the routes it uses to get there. However, BGP is also just another application, competing for processor time and resources with all the other applications running on the same router. The way a router's BGP software is implemented can have an enormous impact not only on that router's individual performance, but on the performance of the Internet as a whole.

Broadly speaking, two techniques can be used to implement a BGP speaker: routes can be processed in a schedule-driven or in an event-driven manner. A given implementation may of course combine elements of both approaches. Each approach has its advantages and disadvantages, and it is beyond the scope of this paper to explore this design space. What we do seek to understand is how best to implement a BGP speaker when using an event-driven architecture.

Implementing BGP route processing in an event-driven manner appears to be advantageous in terms of scalability.
Schedule-driven systems accumulate work to be processed when a timer expires, artificially creating bursty demands on CPU time and network bandwidth. In comparison, event-driven systems can process routes as they arrive and, in general, avoid such periods of peak demand.

Maximizing route throughput with minimal latency might seem like a reasonable router-oriented notion of optimality. However, the cost may be network-wide instability. To address this, RFC 4271 defines a rate limiting mechanism associated with the timer value MinRouteAdvertisementIntervalTimer (MRAI): the minimum amount of time that must elapse between updates for the same destination. This mechanism could more accurately be called rate limiting with summarization, since it allows a BGP speaker to hide many intermediate states from its neighbors and thus greatly reduce the number of update messages propagated in the routing system as a whole. We believe rate limiting is essential for the network-wide performance of the BGP routing system; without it there could very well be an enormous explosion in the number of BGP update messages to be processed at each node.

Manuscript received 8 November 2009; revised 16 June 2010. The authors are with the University of Cambridge (e-mail: Timothy.Griffin@cl.cam.ac.uk). Digital Object Identifier 10.1109/JSAC.2010.101006.

An event-driven, pipelined route processing model does not seem to fit well with the traditional timer-based way of implementing BGP rate limiting. Adding a global timer to an event-driven system would negate the scalability benefits of the architecture. The fundamental contribution of this paper is to demonstrate an implementation of rate limiting that is consistent with an event-driven architecture (Section II).

There are essentially two kinds of event-driven systems: push-based and pull-based. In a push-based system, routes arriving from neighbors trigger a sequence of events that push the route through a processing pipeline.
Indeed, in this approach implementing rate limiting seems very unnatural, since it requires "holding pens" to be introduced into the pipeline, thus violating the spirit of this style of implementation. However, in a pull-based system (the term lazy is also used), routes are pulled through the processing pipeline only when needed. In this paper we describe a pull-based approach to implementing BGP route processing and argue that it is a simple and efficient implementation technique that accommodates rate limiting with little effort.

The event-driven architecture seems to be implemented in only one open-source router, XORP [2], and so we use the XORP code base for our experiments. However, our approach is an architectural optimization not specific to XORP or to BGP. We believe our approach is applicable to any pipelined event-driven system that implements a hard-state path vector protocol.

We use three different versions of XORP (Section III). Standard XORP is the XORP BGP speaker out-of-the-box. This is a push-based event-driven system, which does not include an implementation of rate limiting. Our second version, called Scheduled XORP, simply adds a rate-limiting stage to the pipeline of Standard XORP. We find that this changes the runtime behaviour of the XORP speaker significantly, synchronising all route propagation to a single clock and losing the asynchronous nature of an event-driven system. Our third implementation, called Lazy XORP, implements a pull-based architecture, which required fundamental changes throughout the XORP pipeline.

Overall, our experimental results (Section IV) suggest that Lazy XORP offers significant performance benefits over both Standard and Scheduled XORP. It also offers a number of other advantages, as discussed in Section VI, since it can easily accommodate separate per-peer MRAI timers and even different MRAI policies for different prefixes.

0733-8716/10/$25.00 © 2010 IEEE
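
To make the contrast between pushing and pulling concrete, the following is a minimal sketch of a pull-based output stage that combines rate limiting with summarization. It is an illustration only and is not taken from XORP: the class `LazyPeerOutput`, its methods `notify` and `pull`, the `best_route` callback, and the per-prefix timestamps are all hypothetical names and simplifying assumptions. The point it tries to show is that the stage records only which prefixes have changed; the current route is fetched when the MRAI interval allows an update, so intermediate states are never queued.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple

Route = Optional[str]  # hypothetical: an AS path as a string, None means "no route"


@dataclass
class LazyPeerOutput:
    """Hypothetical per-peer output stage (not XORP's API)."""
    mrai: float                                             # min seconds between updates per prefix
    dirty: Set[str] = field(default_factory=set)            # prefixes changed since last pull
    last_sent: Dict[str, float] = field(default_factory=dict)
    advertised: Dict[str, Route] = field(default_factory=dict)

    def notify(self, prefix: str) -> None:
        # An upstream change only marks the prefix; no route is copied or queued.
        self.dirty.add(prefix)

    def pull(self, now: float, best_route: Callable[[str], Route]) -> List[Tuple[str, Route]]:
        """Return updates for dirty prefixes whose MRAI interval has elapsed."""
        updates: List[Tuple[str, Route]] = []
        for prefix in sorted(self.dirty):        # sorted() copies, so the set may be modified
            if now - self.last_sent.get(prefix, float("-inf")) < self.mrai:
                continue                         # still rate limited: leave the prefix dirty
            self.dirty.discard(prefix)
            route = best_route(prefix)           # pull the *current* state only
            if route != self.advertised.get(prefix):
                self.advertised[prefix] = route
                self.last_sent[prefix] = now
                updates.append((prefix, route))
        return updates


if __name__ == "__main__":
    rib: Dict[str, str] = {}                     # stand-in for the upstream decision process
    out = LazyPeerOutput(mrai=30.0)              # 30 s is an illustrative value

    rib["10.0.0.0/8"] = "AS1 AS9"
    out.notify("10.0.0.0/8")
    print(out.pull(0.0, rib.get))                # first advertisement goes out immediately

    for path in ("AS2 AS9", "AS3 AS9", "AS4 AS9"):
        rib["10.0.0.0/8"] = path                 # three rapid changes
        out.notify("10.0.0.0/8")
    print(out.pull(10.0, rib.get))               # [] because the interval has not elapsed
    print(out.pull(30.0, rib.get))               # one update carrying only the latest path
```

In this sketch the three intermediate paths collapse into a single update, and a prefix that changes and then reverts before it is pulled produces no update at all. Because each `LazyPeerOutput` carries its own `mrai`, separate per-peer intervals need no shared clock; a per-prefix policy could be added by making `mrai` a function of the prefix, which is again an assumption of this sketch rather than a description of Lazy XORP.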