IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 17, NO. 3, MARCH 2009, p. 319

An Energy and Performance Exploration of Network-on-Chip Architectures

Arnab Banerjee, Student Member, IEEE, Pascal T. Wolkotte, Member, IEEE, Robert D. Mullins, Member, IEEE, Simon W. Moore, Senior Member, IEEE, and Gerard J. M. Smit

Abstract—In this paper, we explore the designs of a circuit-switched router, a wormhole router, a quality-of-service (QoS) supporting virtual channel router, and a speculative virtual channel router, and accurately evaluate the energy-performance tradeoffs they offer. Power results from the designs placed and routed in a 90-nm CMOS process show that all the architectures dissipate significant idle state power. The additional energy required to route a packet through the router is then shown to be dominated by the data path. This leads to the key result that, if this trend continues, the use of more elaborate control can be justified and will not be immediately limited by the energy budget. A performance analysis also shows that dynamic resource allocation leads to the lowest network latencies, while static allocation may be used to meet QoS goals. Combining the power and performance figures then allows an energy-latency product to be calculated to judge the efficiency of each of the networks. The speculative virtual channel router is shown to have a very similar efficiency to the wormhole router while providing better performance, supporting its use for general purpose designs. Finally, area metrics are also presented to allow a comparison of implementation costs.

Index Terms—Circuit-switching networks, evaluation, low-power design, measurement, network-on-chip (NoC), packet-switching networks, performance comparison, simulation.

I. INTRODUCTION

In the forthcoming era of many-core computing, networks-on-chip (NoCs) represent the only solution that can provide scalable global on-chip communications [1].
Their regular layout not only deals with the problem of complex wire layout but also allows a natural handling of the communication parallelism inherent in many-core systems. Furthermore, NoCs are a key enabling technology for the provision of many additional services ranging from different quality-of-service (QoS) levels to fault-tolerance. Apart from global communications, the other major challenge facing designers now is high power dissipation. Power dissipation issues have grown to such importance that they now directly constrain attainable performance. Additionally, technology trends suggest that, with further technology scaling, communication power will demand an increasing proportion of the already limited system power budgets. For NoCs, it is therefore now important to understand any performance benefits they can deliver in the context of the power costs they demand.

Manuscript received December 03, 2007; revised April 08, 2008. First published February 03, 2009; current version published February 19, 2009. This research was conducted within the Smart Chips for Smart Surroundings Project (IST-001908) and supported by the Sixth Framework Programme of the European Community.
A. Banerjee, R. D. Mullins, and S. W. Moore are with the Computer Laboratory, University of Cambridge, Cambridge CB3 0FD, U.K. (e-mail: arnab.banerjee@cl.cam.ac.uk; robert.mullins@cl.cam.ac.uk; simon.moore@cl.cam.ac.uk).
P. T. Wolkotte and G. J. M. Smit are with the Department of EEMCS, University of Twente, 7500 AE Enschede, The Netherlands (e-mail: p.t.wolkotte@utwente.nl; g.j.m.smit@utwente.nl).
Digital Object Identifier 10.1109/TVLSI.2008.2011232

Previous studies into the power consumption of NoCs have focused on the use of high-level power models. Although these can offer rapid power estimates, they do so at the expense of the accuracy of the results. Following on from the work presented by Banerjee et al.
in [2], the contribution of this study is a more detailed and accurate power analysis of a range of NoC architectures. These results are then extended by measuring the performance properties of the networks. The comparison of the power demands versus the performance returns of the different NoC designs explored then has strong implications for the class of NoC architectures that should be used.

As outlined in Sections III and IV, four different networks, spanning a large range of router architectural families, were selected for this study: a circuit-switched router, a wormhole router, a QoS-supporting virtual channel (VC) router, and a speculative, single-cycle virtual channel router. Complete Hardware Description Language (HDL) models of these networks were then synthesized, placed, and routed using a standard application-specific integrated circuit (ASIC) tool flow with a 90-nm, high-performance CMOS process. Thereafter, extracted parasitics allowed accurate power and energy figures to be obtained for a variety of experiments, outlined in Section VI-A. Section VI-B then characterizes the performance of the networks under a range of synthetic traffic patterns; these performance results are combined with the measured power results to express an energy-delay product metric for the designs in Section VI-C. Finally, the area measurements reported in Section VI-D allow the implementation costs to be judged.

II. RELATED WORK

Power consumption has become a major design constraint for processing architectures. A good summary of the field and its problems has been provided by Mudge [3]. A brief overview of existing work specific to NoC power characterization is provided here.

Peh et al. have developed insightful high-level power models for a set of NoC router components and used these to estimate the power consumption of various wormhole and virtual channel NoC architectures [4], [5].
Although such high-level models may provide valuable power estimates early in the design cycle,