HPRA: A Pro-Active Hotspot-Preventive High-Performance Routing Algorithm for Networks-on-Chips

Elena Kakoulli, Vassos Soteriou
Department of EECEI, Cyprus University of Technology
{elena.kakoulli,vassos.soteriou}@cut.ac.cy

Theocharis Theocharides
KIOS Research Center, Department of ECE, University of Cyprus
ttheocharides@ucy.ac.cy

Abstract—The inherent spatio-temporal unevenness of traffic flows in Networks-on-Chips (NoCs) can cause unforeseen and, in some cases, severe forms of congestion, known as hotspots. Hotspots reduce the NoC's effective throughput; in the worst-case scenario, the entire network can be brought to an unrecoverable halt as hotspots spread across the topology. To alleviate this problematic phenomenon, several adaptive routing algorithms employ online load-balancing functions, aiming to reduce the possibility of hotspots arising. Most, however, work passively, merely distributing traffic as evenly as possible among alternative network paths; they cannot guarantee the absence of network congestion, as their reactive capability in reducing hotspot formation is limited. In this paper we present a new pro-active Hotspot-Preventive Routing Algorithm (HPRA), which uses the advance knowledge gained from network-embedded Artificial Neural Network-based (ANN) hotspot predictors to guide packet routing across the network in an effort to mitigate any unforeseen near-future occurrences of hotspots. These ANNs are trained offline; during multicore operation they gather online buffer utilization data to predict about-to-be-formed hotspots, promptly informing the HPRA routing algorithm so that it can take appropriate action to prevent hotspot formation.
Evaluation results across two synthetic traffic patterns, and traffic benchmarks gathered from a chip multiprocessor architecture, show that HPRA can reduce network latency and improve network throughput by up to 81% when compared against several existing state-of-the-art congestion-aware routing functions. Hardware synthesis results demonstrate the efficacy of the HPRA mechanism.

I. INTRODUCTION

Networks-on-Chips (NoCs) [5] are the interconnect of preference in state-of-the-art multicore Systems-on-Chips (SoCs) [2] and Chip Multiprocessors (CMPs), such as Intel's 48-core Single-Chip Cloud Computer [13]. NoCs are replacing shared-communication media as they scale efficiently with increasing topology sizes, and their attributes of expandability, modularity, ability to tolerate faults, and energy efficiency aid in quickly designing, testing, and verifying ultra high-performance multicore computing systems. Inherently unpredictable application traffic patterns can cause hotspots to form, temporally and spatially, across an NoC topology. Traffic hotspots arise when NoC routers or modules in a multicore system occasionally receive packetized traffic from other traffic-producing network elements at a faster rate than they can eject it, because interconnecting links and input/output ports are bandwidth-restricted and routed flits constantly compete for network resources (buffers, channels, etc.) [20]. Even a single traffic sender or receiver can cause a hotspot. Hotspots can also be produced by factors such as the lack of traffic balancing under oblivious routing algorithms, non-optimal application mapping onto a multicore chip, application migration, and network-resource demands that occur unpredictably and dynamically [3].
Wormhole Flow-Control (WFC), employed in most NoCs [6], breaks packetized messages down into smaller logical units called flits in an effort to reduce buffer sizing requirements; it also intensifies the detrimental effect of hotspots on NoC performance. Under WFC, the spreading of packets in a pipelined mode across several routers, as flits advance towards their destination, produces backpressure at upstream buffers, causing them to quickly fill up in a domino-style manner. Hence, hotspots can quickly span several portions of the topology at a time, causing further message blocking to propagate spatially across several routers. This NoC resource over-utilization can produce irreversible traffic blockage, which may force the entire NoC to stall indefinitely, a state in which the NoC becomes inoperable.

Hotspot formations are especially unpredictable in general-purpose best-effort parallel on-chip systems such as CMPs, which are considered in this paper, where application patterns cannot be pre-determined and are highly spatio-temporally variable during system operation, unlike in special-purpose SoCs where traffic patterns may be known a priori [31]. Even under load-balancing adaptive routing functions [16], substantial degradation of effective throughput in an NoC can be observed. The development of congestion-management techniques as a means to safeguard the scalability of NoCs, and hence the performance sustainability of their hosting general-purpose CMPs and application-driven SoCs, has been identified as a major research challenge in a number of recent significant surveys [3], [19].
Such new schemes will enable designers and architects to lay the roadmap for future multicore chip design. Current techniques, such as dynamic congestion management in the form of adaptive routing protocols [6], [8], [14], [17], [18], [26], application scheduling [3], and the addition of extra buffering at router input ports [21] to house delayed flits in an attempt to improve NoC throughput in the presence of bursty traffic that may cause hotspots to form, are not always sufficient, as NoC congestion is a complex and unpredictable phenomenon.

In this article we present a new congestion-preventive pro-active routing function, termed Hotspot-Preventive Routing Algorithm (HPRA). Instead of passively measuring current network statistics, such as link and buffer utilization, and attempting to reactively balance out traffic to improve or sustain network throughput, as most existing routing algorithms do [6], HPRA pro-actively prevents the unforeseen formation of NoC hotspots or elevated congestion that may occur in the near future during network operation. This pro-active hotspot prevention is achieved using advance information obtained through Artificial Intelligence (AI) principles, which are applied during network operation to continuously predict the formation of traffic hotspots or congestion. AI principles are chosen because of their adaptability to changing traffic conditions and their ability to learn about small spatio-temporal network variations that can lead to online congestion, and hence to improve their ability to forecast the next hotspot occurrence in advance. An NoC-embedded Artificial Neural Network-based (ANN) hardware mechanism, from our previous work [16], is used to dynamically foresee these potential hotspot formations.
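As a rough illustration of how a predictor of this kind might turn buffer-utilization readings into a hotspot warning, the following Python sketch evaluates a tiny feed-forward network. It is not the ANN hardware mechanism of [16]: the function name `predict_hotspot`, the four-input layout, the two hidden neurons, and all weight values are hypothetical, hand-picked stand-ins for parameters that would be obtained by offline training.

```python
import math


def sigmoid(x):
    """Logistic activation, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_hotspot(utilization, hidden_weights, hidden_biases,
                    output_weights, output_bias):
    """Return a hotspot score in (0, 1) from buffer-utilization inputs.

    utilization: one value in [0, 1] per monitored buffer (assumed layout).
    The network has one hidden layer and a single sigmoid output.
    """
    hidden = [
        sigmoid(sum(w * u for w, u in zip(row, utilization)) + bias)
        for row, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden))
                   + output_bias)


# Hypothetical parameters; real values would come from offline training.
HIDDEN_WEIGHTS = [[2.0, 2.0, 2.0, 2.0],
                  [1.0, 3.0, 1.0, 3.0]]
HIDDEN_BIASES = [-4.0, -4.0]
OUTPUT_WEIGHTS = [3.0, 3.0]
OUTPUT_BIAS = -3.0

# Two illustrative samples: lightly loaded and nearly full buffers.
low_score = predict_hotspot([0.1, 0.2, 0.1, 0.0], HIDDEN_WEIGHTS,
                            HIDDEN_BIASES, OUTPUT_WEIGHTS, OUTPUT_BIAS)
high_score = predict_hotspot([0.9, 1.0, 0.8, 0.9], HIDDEN_WEIGHTS,
                             HIDDEN_BIASES, OUTPUT_WEIGHTS, OUTPUT_BIAS)
```

With these illustrative parameters, the lightly loaded sample yields a score below 0.5 and the nearly full sample a score above 0.5; a routing function could compare such a score against a threshold to decide when to act.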
In HPRA, the routing algorithm uses this advance information to partially or completely throttle hotspot-destined traffic, gradually allowing portions or the entirety of such traffic to reach their destinations, while continuously balancing out traffic that is not hotspot-destined. The latter traffic category is balanced spatially across the topology using real-time statistics gathered from the network, which are used to choose

978-1-4673-3052-7/12/$31.00 ©2012 IEEE