The Dawn of Predictive Chip Yield Design: Along and Beyond the Memory Lane

Rajiv Joshi, IBM T.J. Watson Research Center
Antonio R. Pelella, Arthur Tuminaro, and Yuen Chan, IBM Systems and Technology Group
Rouwaida Kanj, IBM Austin Research Lab

WITH THE RAPID scaling of CMOS technology, die-to-die and intradie process variation effects are increasing dramatically. To meet the demand for high density, designers use the smallest devices and the most aggressive design rules for SRAM cells. This leads to unavoidable manufacturing variations between neighboring memory cell transistors. Specifically, threshold voltage (Vth) mismatch between neighboring devices can lead to a large number of fails in memory designs and degrade SRAM performance and yield. When Vth mismatch is combined with other factors such as narrow-width effects, soft error rate (SER), temperature and process variations, and parasitic transistor resistance, SRAM scaling becomes increasingly difficult as a result of reduced margins.1-3 End-of-life effects can further aggravate the situation.4 The same problem applies to logic design. In fact, designing for the worst case is simply no longer feasible. Designers have used statistical timing techniques to achieve full-chip and full-process coverage based on high-level models and to enable robust design practices.5 Furthermore, statistical techniques improve quality in the context of at-speed testing. But to enable full-chip analysis, the high-level models sacrifice accuracy and deal mainly with three-sigma estimates. Statistical analysis for custom logic and memory-interacting logic has received little attention, especially analysis involving rare-failure estimation. This is true from the functional-behavior perspective as well as the performance perspective. Thus, we need to capture not only average logic delay distributions but also possible design failures.
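Rare-failure estimation of the kind described above is commonly attacked with mixture importance sampling, which biases part of the sampling toward the fail region and reweights by the likelihood ratio. The following is a minimal, illustrative sketch only: the two-parameter fail criterion, the 4.5-sigma margin, the shift of 2.5, and all function names are assumptions for demonstration, not the article's actual methodology.

```python
import math
import random

def cell_fails(dvth1, dvth2, margin=4.5):
    """Toy fail criterion (assumed): the cell fails when the combined
    Vth mismatch of two devices, in sigma units, exceeds the margin."""
    return dvth1 + dvth2 > margin

def mixture_is_fail_prob(n=200_000, shift=2.5, lam=0.5, seed=7):
    """Estimate the rare fail probability by drawing each mismatch
    parameter from a two-component mixture: the nominal N(0,1) with
    probability lam, otherwise N(shift,1) centered near the fail
    region. Each sample is reweighted by the likelihood ratio
    nominal density / mixture density, keeping the estimate unbiased."""
    rng = random.Random(seed)

    def pdf(x, mu):
        # Standard-deviation-1 Gaussian density centered at mu.
        return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)

    acc = 0.0
    for _ in range(n):
        xs, w = [], 1.0
        for _ in range(2):  # two independent mismatch parameters
            mu = 0.0 if rng.random() < lam else shift
            x = rng.gauss(mu, 1.0)
            w *= pdf(x, 0.0) / (lam * pdf(x, 0.0) + (1.0 - lam) * pdf(x, shift))
            xs.append(x)
        if cell_fails(xs[0], xs[1]):
            acc += w  # accumulate the importance weight, not a raw count
    return acc / n
```

Because many of the shifted samples land in the fail region, the estimator resolves a probability on the order of 1e-4 with far fewer samples than plain Monte Carlo, which would see only a handful of fails at this sample size.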
Postsilicon Calibration and Repair for Yield and Reliability Improvement

Editor's note:
Statistical approaches for yield estimation and robust design are vital in the current variation-dominated design era. This article presents a mixture importance sampling methodology to enable yield-driven design and extends its application beyond memories to peripheral circuits and logic blocks.
— Rahul Rao, IBM

As the number of design elements (such as latches) increases, it is possible that a rare functional failure might occur, especially when we want to guarantee the yield for millions of chips. Furthermore, we must analyze the yield of the memory design in situ with the peripheral logic, raising the need for simultaneous statistical analysis of the memory/logic unit. Figure 1 provides an overview of the components of commonly used chip designs, such as the IBM Power7 processor. Recently published work on the Power7 design shows that it contains close to 1.2 billion transistors and eight cores with L2 and shared L3 caches.6 As is the case in state-of-the-art microprocessors, memory units occupy 50% to 60% of the chip, and logic occupies the rest. Therefore, yield prediction through variability analysis, first targeting memory

0740-7475/10/$26.00 © 2010 IEEE. Copublished by the IEEE CS and the IEEE CASS. IEEE Design & Test of Computers
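The scale argument above — a rare per-element failure multiplied across hundreds of millions of identical cells — can be made concrete with back-of-the-envelope arithmetic. The cache size and fail probabilities below are assumed, illustrative numbers, not figures from the article.

```python
import math

def chip_yield(p_cell_fail, n_cells):
    """Probability that none of n_cells independent, identical cells
    fails, given a per-cell fail probability p_cell_fail. Computed in
    log space so tiny probabilities times huge counts stay accurate."""
    return math.exp(n_cells * math.log1p(-p_cell_fail))

# Assumed example: a 32-MB cache holds 32 * 2**20 * 8 bit cells,
# about 2.7e8 cells.
n_cells = 32 * 2**20 * 8

print(chip_yield(1e-10, n_cells))  # high yield: ~97% of chips fail-free
print(chip_yield(1e-7, n_cells))   # yield collapses to essentially zero
```

The contrast shows why three-sigma (roughly 1e-3) characterization is far too coarse for memory arrays: keeping the array yield high requires bounding per-cell fail probabilities many orders of magnitude below what standard Monte Carlo can resolve.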