A novel DRAM architecture as a low leakage
alternative for SRAM caches in a 3D interconnect
context.
Anselme Vignon, Stefan Cosemans, Wim Dehaene
K.U. Leuven
ESAT - MICAS Laboratory
Kasteelpark Arenberg 10, Leuven, Belgium
anselme.vignon@esat.kuleuven.be
Pol Marchal, Marco Facchini
IMEC
Kapeldreef 75, B-3001 Leuven, Belgium
pol.marchal@imec.be
Abstract—This paper presents a DRAM architecture that
improves the DRAM performance/power trade-off, making DRAM
more usable in low-power chip designs based on 3D interconnect
technology. A finer matrix subdivision, combined with buffering
of the bitline signal at the local block level, reduces both the
energy per access and the access time. The resulting performance
matches that of a typical low-power SRAM, while achieving a
significant reduction in area and static power compared to such
memories.
The 128 kb memory architecture proposed here achieves an
access time of 1.3 ns for a dynamic energy of less than 0.2 pJ
per bit. A localized refresh mechanism yields a factor of 10
gain in the static power consumption associated with the cell, and
a factor of 2 gain in area, compared with an equivalent SRAM.
I. CONTEXT
As feature sizes shrink, on-chip memory design is becoming
increasingly challenging. Reducing the typical dimensions
and the supply voltage of SRAM memories degrades
cell stability [1]. Stability is degraded further by intra-die
variations, which also increase the average
power consumption. Several solutions have been investigated
to mitigate this issue, from changing the cell topology [2] [3]
[4] to modifying the peripheral architecture [5]. However,
these solutions increase the memory area and thus compromise
scaling. Embedded DRAM (eDRAM) has been proposed for
large memory arrays. eDRAM clock speed and access time
have been improved to match typical SRAM behavior
[6]. However, using eDRAM requires integrating denser
capacitors into the logic technology process, and thus needs
costly additional process steps.
3D interconnect enables the use of heterogeneous technologies
on the same chip. 3D vias are typically smaller and have
less parasitic capacitance than off-chip connections [7]. In
addition, they can be spread across the chip. This reduces
the routing energy, and increases the number of available
connections between two stacked dies.
These advantages provide a better bandwidth-energy
trade-off for the routing between two stacked dies
than between two packaged dies. A possible application of 3D
interconnect is to separate the logic core of a system from the
memory it requires. Such systems have already been studied
in [8] [9], with an SRAM matrix stacked on top of a logic
layer. It is also possible to stack DRAM on top of a logic
layer.
Fig. 1. Global architecture - WL/BL subdivision. [Block diagram; labels: Local_Address, Block_address, GWL, LWL receiver, 32x32 cells, Local SA, x16, Mux, GBL, Global_SA, data_out.]
This solution offers numerous other advantages compared to
packaged DRAM, including a simpler input/output protocol,
and can solve termination and clock synchronisation
issues by using shorter connections. This allows conventional
DRAM to be used instead of SRAM or embedded DRAM for the
largest memories in an SoC, bringing a higher density than
SRAM without the need to integrate dedicated capacitors
in the logic process, as eDRAM requires.
However, traditional DRAM is outperformed by SRAM in
several respects. The typical access time of a DRAM is still
higher than that of an SRAM, and so is the access energy per
bit. This makes conventional DRAM unsuitable for high-activity
caches, where dynamic access energy consumption
and delay are critical.
978-3-9810801-5-5/DATE09 © 2009 EDAA