THREE DIMENSIONAL FPGA ARCHITECTURES: A SHIFT PARADIGM FOR ENERGY-PERFORMANCE EFFICIENT DSP IMPLEMENTATIONS

Kostas Siozios, Dimitrios Soudris and George Economakos
School of Electrical and Computer Engineering
National Technical University of Athens, Greece
{ksiop, dsoudris, geconom}@microlab.ntua.gr

ABSTRACT

Modern applications exhibit increased complexity, which introduces extra implementation constraints related to delay, power consumption and silicon area. This problem is even more important for Digital Signal Processing (DSP) kernels, as they demand ever higher clock frequencies and logic densities, which cannot be satisfied with existing design technologies. Three-dimensional (3D) integration is an emerging technology that promises to alleviate problems related to performance improvement, but up to now this new design approach has not been sufficiently explored. In this paper we propose a novel 3D FPGA architecture able to implement DSP applications efficiently. The proposed architecture is supported in software by a methodology for exploring DSP-enhanced 3D FPGA devices. During our study we quantify a number of design parameters, such as the selected number of layers, the proper bonding approach and the process technology for each layer. Comparison results demonstrate the efficiency (in terms of performance and power consumption) of the new design paradigm, as compared to existing commercial devices with similar hardware resources.

Index Terms: 3D Architecture, Integration, Interconnection, FPGA, DSP

1. INTRODUCTION

The majority of existing and upcoming applications are characterized by increasing computational complexity of the underlying algorithms. Additionally, the continuous increase in data rates introduces extra constraints on real-time applications.
Limited device performance becomes even more important when the hardware platform targets DSP kernels, since there is increasing demand for higher operating frequencies, lower power dissipation and lower fabrication cost. Up to now there are three platforms (ASICs, DSP processors and FPGAs) able to realize applications belonging to the DSP domain, each with its own advantages and disadvantages. Even though ASICs can be tailored to perform specific functions extremely well and with high power efficiency, their lack of re-programmability restricts their wide acceptance for implementing general-purpose DSP applications (their functionality cannot be iteratively changed or updated during product development). On the other hand, the flexibility to alter the implemented algorithm, which DSP cores and reconfigurable architectures provide, is offset by lower operating frequencies, higher power dissipation and larger silicon area. Consequently, the realization of demanding applications, particularly in the fields of multimedia and mobile communications, often requires processing performance far beyond what existing hardware platforms deliver today.

For decades, semiconductor manufacturers have been shrinking transistor sizes in ICs to achieve the yearly increases in performance described by Moore's Law, a trend that held only because the RC delay was negligible compared to the signal propagation delay. For submicron technology, however, the RC delay becomes a dominant factor. In addition, in the 130nm technology node approximately 51% of microprocessor power was consumed by interconnect, with a projection that, without changes in design philosophy, up to 80% of microprocessor power will be consumed by interconnect in the next 5 years [10].
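The sensitivity of RC delay to wire length can be illustrated with a short numerical sketch. It is not taken from the paper: it assumes the standard first-order estimate for the 50% delay of a distributed RC line, 0.38·r·c·L², and it further assumes that splitting a fixed silicon area evenly over n layers shortens the longest horizontal wire by a factor of √n. The per-micron resistance and capacitance values and the function names are hypothetical placeholders, and the delay of the vertical connections between layers is ignored.

```python
import math


def wire_delay(r_per_um, c_per_um, length_um):
    """First-order 50% delay (seconds) of a distributed RC line: 0.38 * r * c * L^2."""
    return 0.38 * r_per_um * c_per_um * length_um ** 2


def longest_wire_after_stacking(length_um, layers):
    """Assumption: the same total area is split evenly over `layers`,
    so the die edge (and the longest horizontal wire) shrinks by sqrt(layers)."""
    return length_um / math.sqrt(layers)


# Illustrative placeholder values, not measurements from the paper.
R_PER_UM = 0.2        # wire resistance, ohm per micron
C_PER_UM = 0.2e-15    # wire capacitance, farad per micron
LENGTH_2D_UM = 10_000.0  # longest wire in the single-layer (2D) device

for layers in (1, 2, 4):
    length = longest_wire_after_stacking(LENGTH_2D_UM, layers)
    delay = wire_delay(R_PER_UM, C_PER_UM, length)
    print(f"{layers} layer(s): longest wire {length:8.1f} um, delay {delay * 1e9:.3f} ns")
```

Because the delay grows with the square of the length, the model predicts that halving the longest wire cuts its delay to a quarter. This is only a rough indication of why shorter interconnects matter, not a prediction for any particular device.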
These interconnect trends have generated many discussions concerning the end of device scaling as we know it, and have hastened the search for solutions beyond the perceived limits of current 2D devices. An emerging solution to this problem is the use of 3D integration, which replaces a large number of long interconnects (needed in 2D structures) with shorter ones. Such architectures mitigate many of the limitations that 2D devices exhibit. More specifically, they provide: (i) higher logic density in the same footprint area, (ii) shorter interconnections, (iii) reduced signal propagation delay, (iv) greater versatility and resource utilization, and (v) lower power consumption. The shift from horizontal to vertical stacking of circuits has the potential to rewrite the conventions of electronics design.

Although 3D integration promises considerable benefits, several challenges need to be addressed. Among others, new methodologies and software tools that support design space exploration are required. To illustrate some of the main advantages introduced by the new design paradigm, Table I provides a qualitative comparison among alternative technologies for fabricating ICs. Based on this comparison, the single chip and System-on-Chip approaches are the established ways of developing DSP processors, while the 3D integration technology is more suitable for product development with higher performance

978-1-4244-3298-1/09/$25.00 ©2009 IEEE DSP 2009