Multiple Vdd on 3D NoC Architectures

Kostas Siozios, Iraklis Anagnostopoulos and Dimitrios Soudris
School of Electrical and Computer Engineering
National Technical University of Athens, Greece
{ksiop, iraklis, dsoudris}@microlab.ntua.gr

Abstract

The communication problem is a challenging issue for Integrated Circuits (ICs) and often becomes a bottleneck for performance improvement. Three-dimensional (3D) integration and network-on-chip (NoC) are two recent design approaches that promise to alleviate the consequences of interconnection degradation. This paper introduces a new methodology for power-efficient application mapping onto 3D NoC-based devices. By clustering IP cores with similar communication demands onto the same router, it is possible to achieve reasonable energy savings while meeting timing constraints. Experimental results demonstrate the efficiency of the proposed methodology: we achieve energy savings and temperature reduction of up to 19% and 11%, respectively.

Keywords- Multiple Vdd, Network-on-Chip, Voltage scaling

I. INTRODUCTION

As the microelectronics industry moves towards many-core chips, designers face challenges related to interconnection. Among others, only a limited number of cores can be attached to a bus, while the length of buses usually becomes a bottleneck since it does not scale linearly with transistor size. For instance, at the 65nm node, the RC delay of a 1mm global wire at minimum pitch is about 100× higher than the intrinsic delay of an NMOSFET [12]. Also, since interconnection resources dissipate about 50% of the total dynamic power [11], the routing architecture needs to be refined.

In order to alleviate the consequences of this limitation, numerous techniques that replace long interconnections have already been proposed. The most suitable among them is the use of an on-chip network (also known as Network-on-Chip, or NoC) as a bus replacement.
Since this technique promises a more scalable interconnection architecture, many improvements have been proposed so far, ranging from more efficient hardware blocks (e.g., network interfaces and routers) [2] to a heterogeneous interconnection fabric [3] or a different NoC topology [4]. Apart from hardware optimizations, notable design gains can also be achieved at the software level. Typical examples are the algorithms and tools for application mapping under different criteria (e.g., timing [1], power/energy [5], and thermal [6]).

Since dynamic power decreases quadratically as the supply voltage scales down, the multi-Vdd technique is a low-power approach that has already been applied to numerous platforms [13, 14, 15]. Previous studies report that this technique leads to dynamic power savings between 40% and 45% [16, 17]. Existing multi-Vdd architectures are implemented in one of two ways: (i) by using multiple power networks, each of which delivers a different supply voltage, with supporting tools selecting the proper power supply for each IP block, or (ii) by employing only one power network and using a special-purpose structure (named level converter) whenever a block with a low supply voltage drives a block with a high supply voltage. Even though the first approach does not require any special-purpose hardware for signal propagation across regions of the architecture powered by different supply voltages, the number of power wires it employs is proportional to the number of distinct supply voltages. In contrast, the second approach does not increase the number of power wires, but requires a level converter block.

In this paper, we introduce a novel algorithm for application mapping onto a 3D NoC-based mesh architecture. The primary goal of this study is to achieve the highest possible energy savings without affecting the application's functionality.
The proposed mapping algorithm first clusters functionalities based on their communication demand, and then assigns them to appropriate network routers. Since not all routers have similar traffic requirements, their performance can be tuned to maximize energy savings.

The main contributions of this work are summarized as follows: (i) we show that the existing way of designing homogeneous NoC architectures is not efficient, since not all hardware elements need to operate under a single supply voltage, and (ii) we introduce a high-level algorithm for application mapping onto multi-Vdd 3D NoCs. Based on experimental results, our methodology leads to energy savings and temperature reduction of about 19% and 11%, respectively.

The rest of the paper is organized as follows: Section 2 describes the main features of the underlying 3D architecture, and Section 3 introduces the proposed methodology. Section 4 discusses the experimental results that demonstrate the efficiency of our solution, and Section 5 summarizes the conclusions.

II. THE EMPLOYED 3D NOC ARCHITECTURE

The employed NoC architecture, shown in Figure 1, consists of two layers, each of which contains a number of routers arranged in a mesh topology, where each router may be connected to one or more IP cores. The hardware resources assigned to the bottom layer are powered by the low supply voltage, whereas those of the upper layer are powered by the high supply voltage. Since the upper layer consumes more power, this layer ordering results in a better thermal profile (we assume that the cooling mechanism is attached to the top of the 3D NoC). In our architecture, we provide multi-Vdd through level converters, since the significant reduction in wires, as compared to the first approach, results in additional power savings.
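
The two-layer arrangement described above can be illustrated with a minimal sketch. It is not the mapping algorithm of this paper; it only models the two facts stated so far: dynamic power scales quadratically with the supply voltage, and a level converter is needed only when a low-voltage block drives a high-voltage one. The names (Layer, relative_dynamic_power, needs_level_converter, count_level_converters), the voltage values 0.8 V and 1.0 V, and the simplification of one converter per directed vertical link are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    vdd: float  # supply voltage in volts (illustrative value, not from the paper)


def relative_dynamic_power(vdd: float, vdd_ref: float) -> float:
    """Dynamic power relative to a reference supply.

    Assumes switched capacitance, activity and frequency are unchanged,
    so only the quadratic dependence on the supply voltage remains.
    """
    return (vdd / vdd_ref) ** 2


def needs_level_converter(src: Layer, dst: Layer) -> bool:
    """A converter is required only when a lower-voltage block drives a higher-voltage one."""
    return src.vdd < dst.vdd


def directed_vertical_links(width: int, height: int, bottom: Layer, top: Layer):
    """Yield (x, y, src, dst) for both directions of each vertical link in a width x height mesh."""
    for x in range(width):
        for y in range(height):
            yield (x, y, bottom, top)  # upward link: bottom router drives top router
            yield (x, y, top, bottom)  # downward link: top router drives bottom router


def count_level_converters(width: int, height: int, bottom: Layer, top: Layer) -> int:
    """Count directed vertical links that need a converter (one per link is a simplification)."""
    return sum(
        needs_level_converter(src, dst)
        for _, _, src, dst in directed_vertical_links(width, height, bottom, top)
    )


# Hypothetical supply values: low-voltage bottom layer, high-voltage top layer.
BOTTOM = Layer("bottom", 0.8)
TOP = Layer("top", 1.0)

print(relative_dynamic_power(BOTTOM.vdd, TOP.vdd))
print(count_level_converters(3, 3, BOTTOM, TOP))
```

Under these assumptions, only the upward links (bottom layer driving the top layer) are counted, which reflects why the converter overhead is confined to one direction of the vertical communication.
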
In this architecture, communication among IP cores is performed through routers, which have been designed and simulated in VHDL. Each of these routers has six ports, marked “IP” (for connecting to an IP core), “N” (north), “S” (south), “W” (west), “E” (east), and “U” (up) or “D” (down). Details on the router design can be found in [7]. Figure 1(b) also depicts the level converter that is required whenever a block powered by the low supply voltage needs to drive a block powered by the high supply voltage. On the other hand, no additional hardware is needed whenever a block powered by the high supply voltage drives one powered by the low supply voltage.

In recent years, numerous level converters have been proposed. DCVS converters were presented in [13, 16], while level converters with a data latch function were introduced in [17, 18]. These structures need both the high and the low supplies, and the required power/ground routing poses additional constraints.

978-1-4244-8157-6/10/$26.00 ©2010 IEEE 833 ICECS 2010