418 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 31, NO. 3, MARCH 2012

Intelligent Hotspot Prediction for Network-on-Chip-Based Multicore Systems

Elena Kakoulli, Student Member, IEEE, Vassos Soteriou, Member, IEEE, and Theocharis Theocharides, Senior Member, IEEE

Abstract: Hotspots are network-on-chip (NoC) routers or modules in multicore systems which occasionally receive packetized data from other networked producer elements at a rate higher than they can consume it. This adverse phenomenon may greatly reduce the performance of NoCs, especially when wormhole flow-control is employed, as backpressure can cause the buffers of neighboring routers to fill up quickly, leading to a spatial spread of congestion. This can cause the network to saturate prematurely and, in the worst case, render the NoC unrecoverable. Thus, a hotspot prevention mechanism can be greatly beneficial, as it can potentially enable the interconnection system to adjust its behavior and prevent the rise of potential hotspots, subsequently sustaining NoC performance. The inherent unevenness of traffic patterns in an NoC-based general-purpose multicore system such as a chip multiprocessor, due to the diverse and unpredictable access patterns of applications, produces unexpected hotspots whose appearance cannot be known a priori, as application demands are not predetermined. This makes hotspot prediction, and subsequently prevention, difficult. In this paper, we present an artificial neural network (ANN)-based hotspot prediction mechanism that can potentially be used in tandem with a hotspot avoidance or congestion-control mechanism to handle unforeseen hotspot formations efficiently. The ANN uses online statistical data to dynamically monitor the interconnect fabric, and reactively predicts the location of an about-to-be-formed hotspot(s), allowing enough time for the multicore system to react to these potential hotspots.
Evaluation results indicate that a relatively lightweight ANN-based predictor can forecast hotspot formation(s) with an accuracy ranging from 65% to 92%.

Index Terms: Multiprocessor interconnection, neural network hardware, on-chip network, ultralarge-scale integration.

Manuscript received March 28, 2011; revised July 13, 2011; accepted August 29, 2011. Date of current version February 17, 2012. This paper was recommended by Associate Editor R. Marculescu.
E. Kakoulli and V. Soteriou are with the Cyprus University of Technology, Limassol 3603, Cyprus (e-mail: elena.kakoulli@cut.ac.cy; vassos.soteriou@cut.ac.cy).
T. Theocharides is with the University of Cyprus, Nicosia, Cyprus (e-mail: ttheocharides@ucy.ac.cy).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TCAD.2011.2170568

I. Introduction

Networks-on-chip (NoCs) [10] have become the preferred communication backbone in high-performance multicore chips such as general-purpose chip multiprocessors (CMPs) and application-specific systems-on-chips (SoCs). NoCs have already been utilized in ultrahigh-performance products such as the Tilera TILE64 CMP [2] and the 48-core single-chip cloud computer (SCC) [23], hence becoming functional and inbuilt components in massively parallel on-chip systems.

In wormhole flow-control (WFC) [11], which is commonly employed in NoCs [3], [16], communication among the various on-chip components is organized in the form of packetized messages of arbitrary length, which are further segmented into logical link-width chunks called flow-control units, or flits for short.
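The segmentation of a packet into flits described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the `Flit` type, the `segment_packet` function, the head/body/tail labels, and the link width expressed in bytes are all hypothetical names and choices introduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Flit:
    """One link-width chunk of a packet (hypothetical type for illustration)."""
    packet_id: int
    kind: str       # "head", "body", "tail", or "head_tail" for a one-flit packet
    payload: bytes


def segment_packet(packet_id: int, payload: bytes, link_width_bytes: int) -> List[Flit]:
    """Split a packet payload of arbitrary length into link-width flits."""
    if link_width_bytes <= 0:
        raise ValueError("link_width_bytes must be positive")

    # Cut the payload into consecutive chunks no wider than the link.
    # An empty payload still yields one (empty) flit so the packet exists.
    chunks = [
        payload[i:i + link_width_bytes]
        for i in range(0, len(payload), link_width_bytes)
    ] or [b""]

    last = len(chunks) - 1
    flits = []
    for i, chunk in enumerate(chunks):
        if last == 0:
            kind = "head_tail"   # single-flit packet
        elif i == 0:
            kind = "head"        # first flit of the packet
        elif i == last:
            kind = "tail"        # final flit of the packet
        else:
            kind = "body"
        flits.append(Flit(packet_id, kind, chunk))
    return flits
```

For example, calling `segment_packet(7, bytes(range(10)), 4)` splits a 10-byte payload over a 4-byte link into a head, a body, and a tail flit, and concatenating the flit payloads in order restores the original payload.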
However, the spreading of a packet in a pipelined fashion across several routers at a time makes WFC susceptible to packet blocking and possible indefinite stalling, as a prolonged delay at an intermediate router quickly forms backpressure which can spread spatially across the network topology in the reverse direction of the packet’s traversal path. To alleviate this problem, designers usually employ several resource utilization-enhancing mechanisms, such as using virtual channels (VCs) at buffers [9], exploiting advance resource-reserving control packets [36], designing specialty architectures or schemes to guarantee communication rates [7], or, inefficiently, operating high-performance NoCs at relatively low utilization rates to avoid message blocking [35].

Hotspots¹ are NoC routers or modules in multicore systems which occasionally receive packetized traffic from the remaining networked producer elements at a rate faster than they can consume it, as interconnecting links and output (and input) ports are bandwidth-limited and as the traffic load distribution of actual applications is intrinsically uneven, such as at the bisection of 2-D mesh NoC topologies [32]. This consumer-producer gap is inherently unavoidable, especially in general-purpose multicore systems (i.e., CMPs), due to the diverse and unpredictable access patterns of applications, which cause network traffic to be unevenly distributed or to possess a bursty or streaming nature across interconnected paths spanning portions of the network topology. Even a single router can cause a hotspot, and worse, a hotspot can appear even when using links of theoretically infinite bandwidth. Hotspots can also be caused by nonoptimal application mapping, lack of traffic balancing when using oblivious routing algorithms, application migration, and resource demands that arise unpredictably and dynamically [4], [34].
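The consumer-producer gap behind a hotspot can be illustrated with a toy model of a single bounded input buffer. This is a sketch under stated assumptions and not the mechanism or simulator used in the paper: the function name `simulate_input_buffer`, its parameters, constant per-cycle rates, and the drain-then-accept ordering are all hypothetical simplifications. Flits that cannot be accepted are only counted here; in a real wormhole NoC they would remain in upstream buffers, which is how backpressure spreads to neighboring routers.

```python
from typing import List, Tuple


def simulate_input_buffer(
    arrivals_per_cycle: int,
    drains_per_cycle: int,
    capacity: int,
    cycles: int,
) -> Tuple[List[int], int]:
    """Return (occupancy after each cycle, total flits refused for lack of space).

    Toy model (assumption): fixed arrival and drain rates, one bounded buffer.
    """
    occupancy = 0
    refused = 0
    history = []
    for _ in range(cycles):
        # The consumer drains up to its per-cycle rate first.
        occupancy -= min(occupancy, drains_per_cycle)

        # The producer side then offers flits; only what fits is accepted.
        free_slots = capacity - occupancy
        accepted = min(arrivals_per_cycle, free_slots)
        refused += arrivals_per_cycle - accepted
        occupancy += accepted

        history.append(occupancy)
    return history, refused


# Producer faster than consumer: the buffer fills and flits start being refused.
hot_history, hot_refused = simulate_input_buffer(3, 2, capacity=8, cycles=10)

# Matched rates: occupancy stays flat and nothing is refused.
ok_history, ok_refused = simulate_input_buffer(2, 2, capacity=8, cycles=10)
```

With a producer rate one flit per cycle above the consumer rate, the occupancy climbs by one flit each cycle until the buffer is full, after which every cycle refuses the excess flit. With matched rates the occupancy never grows, which is the contrast the hotspot definition above relies on.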
¹ The term “hotspot” may also refer to networked elements which possess a thermal profile that is higher than that of the network’s average temperature level, a direct consequence of localized network contention.

0278-0070/$31.00 © 2012 IEEE

Hotspots have a spatial component when a subset of the routers receives the majority of the traffic and a temporal com-