370 4-900784-01-X 2005 Symposium on VLSI Circuits Digest of Technical Papers

23-4
A 512Mbit, 1.6Gbps/pin DDR3 SDRAM Prototype with C_IO Minimization and Self-Calibration Techniques

Churoo Park, HoeJu Chung, Yun-Sang Lee, Jae-Kwan Kim, Jae-Jun Lee, Moo-Sung Chae, Dae-Hee Jung, Sung-Ho Choi, Seung-young Seo, Taek-Seon Park, Jun-Ho Shin, Jin-Hyung Cho, Seunghoon Lee, Kyu-hyoun Kim, Jung-Bae Lee, Changhyun Kim and Soo-In Cho
DRAM Design, Memory Division, Samsung Electronics Company, Hwasung, Korea

Abstract

A 1.5V, 512Mbit DDR3 synchronous DRAM prototype operating at 1.6Gbps/pin was designed in 80nm technology. The output drivers are merged with ODT and protected by SCR-type ESD devices, which minimizes C_IO and improves signal integrity in point-to-2points interfacing. A hybrid latency control scheme is proposed to achieve higher bandwidth as well as to turn the DLL on and off efficiently. Temperature read-out and per-bank-refresh are also implemented.

Introduction

Main memories have been continuously evolving to be faster, more reliable, and more cost-effective. Although meeting all of these goals at once is becoming more difficult, we have realized a DDR3 SDRAM prototype that keeps them in balance. The keys to good signal integrity in a P22P (point-to-2points) interface operating at 1.6Gbps/pin are the minimization of C_IO and competent ODT/OCD calibration schemes, which are realized in this prototype by reorganizing the output drivers and the calibration schemes. In the 8b-prefetch datapath with 3-stage pipelining, a newly devised hybrid-type latency control scheme and 2-step multiplexing handle up to 128b of parallel data in the x16 configuration. An efficient protocol for temperature read-out is proposed, supporting the CPU while it controls heat in high-speed operation.
Per-bank-refresh is another experimental feature of this prototype, virtually removing the loss of memory bandwidth caused by the unavoidable refresh operation common to all DRAMs.

Fig. 1. Chip architecture of 512Mbit DDR3 SDRAM Prototype with 8 banks and 8b prefetch scheme.

Architecture

Fig. 1 shows the chip architecture of the 512Mbit DDR3 SDRAM prototype. The memory array comprises 8 banks, and each bank is distributed above and below the middle peripheral area, halving the total number of global IO lines, which are shared by 4 adjacent banks. The 128b parallel data (64b from the upper block and 64b from the lower block in the x16 configuration) is serialized through 24:1 multiplexing, which is realized in two steps, 24:4 and 4:1, as shown in Fig. 2. Fig. 2 also shows that write data are transferred in units of 4 to achieve the on-the-fly BL4 chopping operation. The clocks from the proven, stable DLL system [1] and the control signals from the FSM (finite state machine) feed into the hybrid latency control system, whose simplified block diagram is also shown in Fig. 2.

Fig. 2. Simplified block diagram of datapath to achieve the bandwidth of 1.6Gbps/pin.

The fully parallel latency control scheme [2], although it can provide more bandwidth, lacks the timing margin for turning the DLL off to reduce power consumption. The hybrid-type latency control scheme provides reasonable bandwidth without losing controllability of the DLL. The minimum clock cycle time in the case of the hybrid-type latency scheme is given as