The Impact of Output Selection Function Choice on the Performance of Adaptive Wormhole Routing

Loren Schwiebert and Renelius Bell
Department of Electrical and Computer Engineering
Wayne State University
Detroit, MI 48202–3902
Email: loren@ece.eng.wayne.edu, bell@ernie.eng.wayne.edu

Abstract

Many adaptive routing algorithms have been proposed for wormhole-routed interconnection networks. Comparatively little work, however, has been done on determining how the output selection function (routing policy) affects the performance of an adaptive routing algorithm. In this paper, we present a detailed simulation study of various selection functions for a fully adaptive mesh routing algorithm. The simulation results show that the choice of selection function has a significant effect on the average message latency. Thus, a naive implementation of an adaptive routing algorithm may lead to poor performance. These selection functions are also compared with a theoretically optimal selection function [1]. We show that although theoretically optimal, the actual performance of the optimal selection function is not best. An explanation and interpretation of the results is provided.

1 Introduction

Wormhole routing [5] has become the switching technique of choice in most modern distributed-memory and distributed-shared-memory multiprocessors. Wormhole routing moves messages through the network by dividing each message into flits, where a flit is typically a small multiple of the width of a physical channel. The header flit of a packet contains the routing information and the data flits of the packet follow the header flit through the network. The major advantage of wormhole routing is that when the header arrives at an intermediate router, the router forwards the message header to a neighboring router as soon as an output channel the message can use is available.
Since the flits of a message are forwarded as soon as possible, the message latency is largely insensitive to the distance between the source and destination. Virtual cut-through also allows messages to progress as soon as an output channel is available and buffers packets only when the output channel is busy. Wormhole routing, however, requires only enough storage at each router to buffer a few flits, rather than the entire packet. The low minimum latency and modest buffering requirements account for the popularity of wormhole routing in distributed-memory multiprocessors. See Ni and McKinley [10] for a detailed explanation of wormhole routing.

This research was supported in part by National Science Foundation Grant #HRD-9450371.

Wormhole routing is susceptible to contention even with moderate traffic, which results in higher message latency. Because message headers are forwarded immediately, a message can span many channels simultaneously. Furthermore, a message that is blocked remains in the network and the message continues to hold all the channels it currently occupies. Thus, a message that traverses several channels could block many other messages.

Dally [4] proposes a cost-effective method of reducing contention by allowing multiple virtual channels to share the same physical channel. A virtual channel is really just a buffer on a physical channel. By providing multiple virtual channels on the same physical channel, multiple messages can be multiplexed over the same physical channel and thus bypass blocked messages. Contention can be further reduced by permitting a message to choose from among the multiple paths in the network between the source and destination, which also reduces the message latency and makes better use of the network channels. Dally and Seitz [5] have shown, however, that a deadlock configuration can be formed if no restrictions are placed on the use of channels.
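
The claim above that wormhole latency is largely insensitive to distance can be illustrated with a simplified, contention-free latency model. This is only a sketch under stated assumptions: every channel transfers one flit per time unit, routing decisions take no extra time, and no message is ever blocked. The function names and the store-and-forward baseline are illustrative and are not taken from the paper.

```python
def store_and_forward_latency(hops: int, message_flits: int, flit_time: float = 1.0) -> float:
    """Assumed baseline: each router receives the whole packet before forwarding it,
    so the full transfer time is paid again on every hop."""
    return hops * message_flits * flit_time


def wormhole_latency(hops: int, message_flits: int, flit_time: float = 1.0) -> float:
    """Assumed pipelined model: the header needs one flit time per hop, and the
    remaining flits follow it one flit time apart."""
    return (hops + message_flits - 1) * flit_time


if __name__ == "__main__":
    # Hypothetical 32-flit message sent over increasingly long paths.
    for hops in (2, 8, 16):
        print(hops, store_and_forward_latency(hops, 32), wormhole_latency(hops, 32))
```

In this model the distance enters the wormhole latency only as an additive term, whereas it multiplies the store-and-forward latency. The model deliberately ignores contention, which is the effect the rest of this section is concerned with.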

Oblivious routing algorithms define a single path between a source and destination. Adaptive routing algorithms, on the other hand, support multiple paths between a source and destination. Adaptive routing algorithms are either minimal or nonminimal. Minimal routing algorithms allow only shortest paths to be chosen, while nonminimal routing algorithms do not require messages to use only shortest paths. Adaptive routing algorithms can be further differentiated by the number of shortest paths allowed.
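
The distinction between oblivious and minimal adaptive routing, and the role of a selection function, can be sketched for a 2D mesh as follows. This is a minimal illustration under assumptions: nodes are (x, y) coordinates, dimension-order (x-then-y) routing stands in for an oblivious algorithm, and `select_first_free` is a hypothetical naive policy, not one of the selection functions studied in the paper. Deadlock avoidance and virtual channels are ignored.

```python
from typing import Callable, List, Optional, Tuple

Node = Tuple[int, int]


def minimal_directions(current: Node, dest: Node) -> List[str]:
    """Minimal adaptive routing: every direction that brings the message closer."""
    (x, y), (dx, dy) = current, dest
    directions = []
    if dx > x:
        directions.append("+x")
    elif dx < x:
        directions.append("-x")
    if dy > y:
        directions.append("+y")
    elif dy < y:
        directions.append("-y")
    return directions


def oblivious_xy(current: Node, dest: Node) -> List[str]:
    """Dimension-order routing: at most one allowed direction (x is corrected first)."""
    return minimal_directions(current, dest)[:1]


def select_first_free(candidates: List[str],
                      is_free: Callable[[str], bool]) -> Optional[str]:
    """Hypothetical selection function: take the first candidate whose channel is free."""
    for direction in candidates:
        if is_free(direction):
            return direction
    return None  # every candidate is busy, so the header must wait


if __name__ == "__main__":
    busy = {"+x"}  # assumed channel state at the current router
    candidates = minimal_directions((0, 0), (2, 3))
    print(candidates, select_first_free(candidates, lambda d: d not in busy))
```

The routing algorithm determines the candidate set, and the selection function picks one member of it. With the oblivious variant the candidate set never has more than one entry, so there is nothing left for a selection function to decide.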