IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 7, JULY 2006 693

“It’s a Small World After All”: NoC Performance Optimization Via Long-Range Link Insertion

Umit Y. Ogras, Student Member, IEEE, and Radu Marculescu, Member, IEEE

Abstract—Networks-on-chip (NoCs) represent a promising solution to complex on-chip communication problems. The NoC communication architectures considered so far are based on either completely regular or fully customized topologies. In this paper, we present a methodology to automatically synthesize an architecture which is neither regular nor fully customized. Instead, the communication architecture we propose is a superposition of a few long-range links and a standard mesh network. The few application-specific long-range links we insert significantly increase the critical traffic workload at which the network transitions from a free to a congested state. This way, we can exploit the benefits offered by both complete regularity and partial topology customization. Indeed, our experimental results demonstrate a significant reduction in the average packet latency and a major improvement in the achievable network throughput with minimal impact on network topology.

Index Terms—Design automation, multiprocessor system-on-chip (MP-SoC), network-on-chip (NoC), performance analysis.

I. INTRODUCTION

Continuous scaling of CMOS technology makes it possible to integrate a large number of heterogeneous devices that need to communicate efficiently on a single chip. Large-scale integration of these diverse blocks calls for truly scalable communication architectures. While the legacy bus-based communication architecture is the standard solution for on-chip communication, its limited scalability, both in terms of performance and power efficiency, makes it a poor choice for future systems-on-chip (SoCs).
Therefore, it has been recently proposed to replace the custom (global) wires with structured networks-on-chip (NoCs) [4], [12], [18], [21].

Regular NoC architectures based on grid-like (or two-dimensional (2-D) lattice) topologies, as shown in Fig. 1, provide structured global interconnects. This ensures well-controlled electrical parameters and reduced power consumption on the global wires. However, such architectures may suffer from long packet latencies due to the lack of fast paths between remotely situated nodes. Indeed, having to traverse many hops between any two remotely communicating nodes increases the message blocking probability. This makes the message latencies unpredictable and guaranteed service operation hard to achieve. Moreover, since most real-life applications have widely varying communication requirements, such general-purpose platforms may become less attractive for application-specific designs that need to guarantee a certain level of performance.

Manuscript received July 1, 2005; revised January 16, 2006. This work was supported by the Marco Gigascale Systems Research Center (GSRC). The authors are with the Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213-3890 USA (e-mail: uogras@ece.cmu.edu; radum@ece.cmu.edu). Digital Object Identifier 10.1109/TVLSI.2006.878263

On the other hand, fully customized topologies [32], [35], [38] improve the overall system performance at the expense of altering the regularity of the grid structure. This results in global wires with widely varying lengths, performance, and power consumption. Consequently, better logical connectivity comes at the expense of a penalty in the structured wiring. Hence, usual problems like crosstalk, timing closure, and wire routing may undermine the advantages expected from customization.
Besides these issues, the customized topologies require specific routing algorithms, which can be difficult to implement.

Fortunately, these two extreme points in the design space (i.e., purely regular or completely customized topologies) are not the only possible solutions for NoC architectures. In fact, many technological, biological, and social networks are neither completely regular nor completely irregular [22], [41], [42]. One can view these networks as a superposition of clustered nodes with many short-range links and a few long-range links that produce shortcuts among different regions of the network. The existence of short paths between such remotely situated nodes lies at the heart of the small-world phenomenon, popularly known as six degrees of separation [30], [41]. A useful feature of these small-world networks (e.g., WWW, electrical power grid, and collaboration networks) is the logarithmic relation between the mean internode distance and network size.

Starting from this idea, this paper explores the potential of using standard mesh topologies in conjunction with a few additional long-range links to improve the performance of NoCs. Inserting a few long-range links into the basic regular architecture clearly reduces the average distance between remotely situated nodes. Furthermore, the node/edge connectivity, and hence the network reliability, is also improved. However, long-range link insertion has to be done judiciously, as it has a more pronounced, yet barely studied, impact on the dynamic properties of the network characterized by traffic congestion. At low traffic loads, the average packet latency exhibits a weak dependence on the traffic injection rate. However, when the traffic injection rate exceeds a critical value, the packet delivery latency rises abruptly and the network throughput starts collapsing (see Fig. 2).
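The claim that a few long-range links reduce the average internode distance can be checked with a small experiment. The following is a minimal sketch, not the insertion methodology of this paper: it assumes a 4 x 4 mesh, uniform weighting of all node pairs (no application traffic), and a single shortcut between opposite corners chosen arbitrarily for illustration. The function names `mesh_adjacency`, `add_link`, and `average_distance` are hypothetical helpers introduced here.

```python
from collections import deque
from itertools import product


def mesh_adjacency(n):
    """Adjacency lists of an n x n mesh; nodes are (row, col) tuples."""
    adj = {node: [] for node in product(range(n), repeat=2)}
    for r, c in adj:
        # Connect each node to its right and lower neighbor (if present),
        # so every mesh edge is added exactly once, in both directions.
        for dr, dc in ((0, 1), (1, 0)):
            nb = (r + dr, c + dc)
            if nb in adj:
                adj[(r, c)].append(nb)
                adj[nb].append((r, c))
    return adj


def add_link(adj, a, b):
    """Insert a bidirectional long-range link between nodes a and b."""
    adj[a].append(b)
    adj[b].append(a)


def average_distance(adj):
    """Mean hop count over all ordered pairs of distinct nodes (BFS per source)."""
    total = 0
    pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs


if __name__ == "__main__":
    mesh = mesh_adjacency(4)
    print("plain mesh:", average_distance(mesh))

    # Assumption: one corner-to-corner shortcut, picked only for illustration.
    add_link(mesh, (0, 0), (3, 3))
    print("with one long-range link:", average_distance(mesh))
```

This static hop-count measure captures only the topological benefit; it says nothing about the congestion behavior discussed here, which depends on the traffic injection rate.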
The state before the congestion (i.e., the area at the left-hand side of the critical value) represents the free state, while the state beyond the critical value is the congested state. Finally, the transition from the free to the congested state is known as the phase-transition region.

1063-8210/$20.00 © 2006 IEEE

The emergence of congestion in mesh networks can be significantly delayed by introducing a few additional long-range links (see Fig. 2) [15]. It is important to note that, due to the abrupt