IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 25, NO. 12, DECEMBER 2006

System-Level Buffer Allocation for Application-Specific Networks-on-Chip Router Design

Jingcao Hu, Member, IEEE, Umit Y. Ogras, Student Member, IEEE, and Radu Marculescu, Member, IEEE

Abstract—In this paper, a novel system-level buffer planning algorithm that can be used to customize the router design in networks-on-chip (NoCs) is presented. More precisely, given the traffic characteristics of the target application and the total budget of the available buffering space, the proposed algorithm automatically assigns the buffer depth for each input channel, in different routers across the chip, such that the overall performance is maximized. This is in sharp contrast with the uniform assignment of buffering resources (currently used in NoC design), which can significantly degrade the overall system performance. Indeed, the experimental results show that the proposed algorithm is very fast and achieves significant performance improvements compared to the uniform buffer allocation. For instance, for a complex audio/video application, about 80% savings in buffering resources can be achieved by smart buffer allocation using the proposed algorithm.

Index Terms—Buffer sizing, design automation, low power, networks-on-chip (NoCs), optimization.

I. INTRODUCTION

With the recent advances in semiconductor technology, it is possible for designers to integrate on a single chip tens of intellectual property (IP) blocks together with large amounts of embedded memory. This richness of computational resources (CPU or DSP cores, video processors, etc.) places tremendous demands on the communication resources as well. Additionally, the shrinking feature size in deep submicrometer (DSM) technologies makes interconnect delay and power consumption the dominant factors in the optimization of modern systems.
Interconnect optimization under DSM effects is complicated because of the worsening effects due to crosstalk, electromagnetic interference, etc. [31]. The NoC approach was proposed as a promising solution to these complex on-chip communication problems [4], [10], [15], [21]. For the NoC architecture, the chip is divided into a set of interconnected blocks (or nodes) where each node can be a general-purpose processor, a DSP, a memory subsystem, etc. Fig. 1(a) shows an example of an NoC implementation where nodes are connected using a simple two-dimensional (2-D) mesh topology. A router is embedded within each node with the objective of connecting it to its neighboring nodes [a typical on-chip router for a 2-D mesh NoC is shown in Fig. 1(b)]. As such, instead of routing design-specific global wires, the internode communication can be achieved by routing packets.

Fig. 1. (a) NoC implementing a 2-D mesh topology. (b) Typical on-chip router architecture.

Manuscript received March 1, 2005; revised July 5, 2005 and November 23, 2005. This work was supported in part by the National Science Foundation (NSF) under Grant CCR-00-93104 and in part by Marco Gigascale Systems Research Center (GSRC). This paper was recommended by Associate Editor R. Gupta.
J. Hu was with the Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213-3890 USA. He is now with Tabula, Inc., Santa Clara, CA 95054 USA (e-mail: jhu@tabula.com).
U. Y. Ogras and R. Marculescu are with the Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213-3890 USA (e-mail: uogras@ece.cmu.edu; radum@ece.cmu.edu).
Digital Object Identifier 10.1109/TCAD.2006.882474

Compared to a standard data macro network, an on-chip network is by far more resource limited. To minimize the implementation cost, the on-chip network should be implemented with very little area overhead.
This is especially important for those architectures composed of nodes designed at a fine level of granularity. The input buffers in a typical on-chip router [highlighted in Fig. 1(b)] take a significant portion of