HPRA: A Pro-Active Hotspot-Preventive High-Performance Routing Algorithm for Networks-on-Chips

Elena Kakoulli†, Vassos Soteriou†, Theocharis Theocharides‡
†Department of EECEI, Cyprus University of Technology, {elena.kakoulli,vassos.soteriou}@cut.ac.cy
‡KIOS Research Center, Department of ECE, University of Cyprus, ttheocharides@ucy.ac.cy

Abstract—The inherent spatio-temporal unevenness of traffic flows in Networks-on-Chips (NoCs) can cause unforeseen and, in some cases, severe forms of congestion, known as hotspots. Hotspots reduce the NoC’s effective throughput; in the worst case, the entire network can be brought to an unrecoverable halt as hotspots spread across the topology. To alleviate this problem, several adaptive routing algorithms employ online load-balancing functions that aim to reduce the possibility of hotspots arising. Most, however, work passively, merely distributing traffic as evenly as possible among alternative network paths, and they cannot guarantee the absence of network congestion because their reactive capability to reduce hotspot formation is limited. In this paper we present a new pro-active Hotspot-Preventive Routing Algorithm (HPRA), which uses the advance knowledge gained from network-embedded Artificial Neural Network-based (ANN) hotspot predictors to guide packet routing across the network in an effort to mitigate any unforeseen near-future occurrences of hotspots. These ANNs are trained offline, and during multicore operation they gather online buffer utilization data to predict about-to-be-formed hotspots, promptly informing the HPRA routing algorithm so that it can take appropriate action to prevent hotspot formation.
Evaluation results across two synthetic traffic patterns, and traffic benchmarks gathered from a chip multiprocessor architecture, show that HPRA can reduce network latency and improve network throughput by up to 81% when compared against several existing state-of-the-art congestion-aware routing functions. Hardware synthesis results demonstrate the efficacy of the HPRA mechanism.

I. INTRODUCTION

Networks-on-Chips (NoCs) [5] are the interconnect of preference in state-of-the-art multicore Systems-on-Chips (SoCs) [2] and Chip Multiprocessors (CMPs), such as Intel’s 48-core Single-Chip Cloud Computer [13]. NoCs are replacing shared-communication mediums because they scale efficiently with increasing topology sizes, and because their expandability, modularity, fault tolerance, and energy efficiency aid in quickly designing, testing, and verifying ultra high-performance multicore computing systems. The inherently unpredictable application traffic patterns can, however, cause hotspots to form, both temporally and spatially, across an NoC topology. Traffic hotspots arise when NoC routers or modules in a multicore system occasionally receive packetized traffic from other network elements at a faster rate than they can eject it, since interconnecting links and input/output ports are bandwidth-restricted and routed flits constantly compete for network resources (buffers, channels, etc.) [20]. Even a single traffic sender or receiver can cause a hotspot. Hotspots can also be produced by factors such as the lack of traffic balancing under oblivious routing algorithms, non-optimal application mapping onto a multicore chip, application migration, and network-resource demands that arise unpredictably at run time [3].
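The rate imbalance described above can be illustrated with a minimal sketch of a single finite router buffer. This is not part of HPRA: the function name, the integer per-cycle rates, and the choice to count refused arrivals rather than model their retry upstream are all simplifying assumptions made here for illustration.

```python
def simulate_router_buffer(arrival_rate, ejection_rate, capacity, cycles):
    """Track the occupancy of one finite input buffer, cycle by cycle.

    Hypothetical toy model (assumption, not from the paper):
      arrival_rate  - flits offered to the buffer per cycle (integer)
      ejection_rate - maximum flits the router can eject per cycle (integer)
      capacity      - buffer depth in flits
    Returns (occupancy_trace, blocked_flits).
    """
    occupancy = 0
    blocked = 0
    trace = []
    for _ in range(cycles):
        # Eject first: at most `ejection_rate` flits leave in one cycle.
        occupancy -= min(occupancy, ejection_rate)
        # Accept arrivals only while free slots remain.
        accepted = min(arrival_rate, capacity - occupancy)
        # Refused flits are only counted here; a real NoC would stall them upstream.
        blocked += arrival_rate - accepted
        occupancy += accepted
        trace.append(occupancy)
    return trace, blocked


if __name__ == "__main__":
    # Arrivals outpace ejection: the buffer saturates and flits start to block.
    print(simulate_router_buffer(2, 1, 4, 6))
    # Ejection keeps up with arrivals: occupancy stays low and nothing blocks.
    print(simulate_router_buffer(1, 2, 4, 5))
```

With two flits arriving and one ejected per cycle, the four-flit buffer in this sketch fills by the third cycle and every later cycle refuses one flit. When ejection keeps pace, occupancy never exceeds one flit.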
Wormhole Flow-Control (WFC), employed in most NoCs [6], breaks packetized messages down into smaller logical units called flits in an effort to reduce buffer sizing requirements, and in doing so intensifies the detrimental effect of hotspots on NoC performance. Under WFC, the spreading of packets in a pipelined mode across several routers, as flits advance towards their destination, produces backpressure at upstream buffers, causing them to fill up quickly in a domino-style mode. Hence, a hotspot can quickly span several portions of the topology at a time, causing further message blocking to propagate spatially across several routers. This over-utilization of NoC resources can produce irreversible traffic blockage, which may force the entire NoC to stall indefinitely, a state in which the NoC becomes inoperable.

Hotspot formations are especially unpredictable in general-purpose best-effort parallel on-chip systems such as CMPs, which are considered in this paper, where application patterns cannot be pre-determined and are highly spatio-temporally variable during system operation, unlike in special-purpose SoCs where traffic patterns may be known a priori to system operation [31]. Even under the use of load-balancing adaptive routing functions [16], substantial effective throughput degradation in an NoC can be observed. The development of congestion-management techniques as a means to safeguard the scalability of NoCs, and hence the performance sustainability of their hosting general-purpose CMPs and application-driven SoCs, has been identified as a major research challenge in a number of recent significant surveys [3], [19].
Such new schemes will enable designers and architects to lay the roadmap for future multicore chip design. Current techniques, such as dynamic congestion management in the form of adaptive routing protocols [6], [8], [14], [17], [18], [26], application scheduling [3], and the addition of extra buffering at router input ports [21] to house delayed flits in an attempt to improve NoC throughput in the presence of bursty traffic that may cause hotspots to form, are not always sufficient, as NoC congestion is a complex and unpredictable phenomenon.

In this article we present a new congestion-preventive pro-active routing function, termed Hotspot-Preventive Routing Algorithm (HPRA). Instead of passively measuring current network statistics, such as link and buffer utilization, and attempting to reactively balance out traffic to improve or sustain network throughput, as most existing routing algorithms do [6], HPRA pro-actively prevents the unforeseen formation of NoC hotspots or elevated congestion that may occur in the near future during network operation. This pro-active hotspot prevention is achieved using advance information obtained through Artificial Intelligence (AI) principles, which are applied during network operation to continuously predict the formation of traffic hotspots or congestion. AI principles are chosen because of their adaptability to changing traffic conditions and their ability to learn the small spatio-temporal network variations that can lead to online congestion, and hence to improve their ability to forecast the next hotspot occurrence in advance. An NoC-embedded Artificial Neural Network-based (ANN) hardware mechanism, from our previous work [16], is used to dynamically foresee these potential hotspot formations.
Here, the routing algorithm utilizes this advance information to partially or completely throttle hotspot-destined traffic, gradually allowing portions or the entirety of such traffic to reach their destinations, while continuously balancing out traffic that is not hotspot-destined. The latter traffic category is balanced spatially across the topology via the use of real-time statistics gathered from the network, that are used to choose

978-1-4673-3052-7/12/$31.00 ©2012 IEEE