XT-PRAGGMA: Crosstalk Pessimism Reduction Achieved with GPU Gate-level Simulations and Machine Learning

Vidya A. Chhabria
University of Minnesota, Minneapolis, MN, USA

Ben Keller, Yanqing Zhang, Sandeep Vollala, Sreedhar Pratty
NVIDIA Corporation, Santa Clara, CA, USA

Haoxing Ren, Brucek Khailany
NVIDIA Corporation, Austin, TX, USA

Abstract

Accurate crosstalk-aware timing analysis is critical in nanometer-scale process nodes. While today’s VLSI flows rely on static timing analysis (STA) techniques to perform crosstalk-aware timing signoff, these techniques are limited by their static nature: they use imprecise heuristics such as arbitrary aggressor filtering and simplified delay calculations. This paper proposes XT-PRAGGMA, a tool that uses GPU-accelerated dynamic gate-level simulations and machine learning to eliminate false aggressors and accurately predict crosstalk-induced delta delays. XT-PRAGGMA reduces STA pessimism and provides crucial information to identify crosstalk-critical nets, which can be considered for accurate SPICE simulation before signoff. The proposed technique is fast (less than two hours to simulate 30,000 vectors on million-gate designs) and reduces falsely reported total negative slack in timing signoff by 70%.

CCS Concepts

Hardware → Physical design (EDA); Methodologies for EDA; Static timing analysis; Timing analysis and sign-off; Simulation and emulation; Transition-based timing analysis.

Keywords

Crosstalk analysis, machine learning (ML), and GPU-accelerated gate-level simulations.

ACM Reference Format:
Vidya A. Chhabria, Ben Keller, Yanqing Zhang, Sandeep Vollala, Sreedhar Pratty, Haoxing Ren, and Brucek Khailany. 2022. XT-PRAGGMA: Crosstalk Pessimism Reduction Achieved with GPU Gate-level Simulations and Machine Learning. In Proceedings of the 2022 ACM/IEEE Workshop on Machine Learning for CAD (MLCAD ’22), September 12–13, 2022, Snowbird, UT, USA. ACM, New York, NY, USA, 7 pages.
https://doi.org/10.1145/3551901.3556483

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
MLCAD ’22, September 12–13, 2022, Snowbird, UT, USA
© 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-9486-4/22/09...$15.00

1 Introduction

Analyzing crosstalk is crucial for timing signoff in modern digital integrated circuits (ICs) with very small on-chip wire geometries and large transistor densities. Parasitic cross-coupling between physically adjacent nets can lead to transition slowdowns or speedups (known as delta delays) or voltage glitches (known as bumps). These effects depend on the switching state and the arrival times of the signal on the net being analyzed (the victim) and on one or more coupled nets (aggressors). Fig. 1(a) shows an example of crosstalk in which the victim net is coupled with four aggressors, A1–A4, which induce a slowdown on the victim’s transition. It is essential to estimate the delta delay to accurately close timing in today’s highly congested ICs.

Figure 1: (a) Physically adjacent nets with shared coupling capacitances and a corresponding slowdown in transition (delta delay). (b) Crosstalk-aware STA operates by finding overlapping timing windows between aggressor and victim.
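The timing-window idea in Fig. 1(b) can be illustrated with a minimal sketch. It assumes each net is summarized by a single interval between its earliest and latest possible switching times, and it treats an aggressor as relevant only if that interval intersects the victim's. The `Window` class, the function names, and all numeric values are hypothetical and chosen only for illustration; they are not taken from the paper or from any STA tool, which would also consider factors such as coupling strength and slew.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Earliest and latest possible switching times of a net (arbitrary units)."""
    early: float
    late: float


def overlaps(a: Window, b: Window) -> bool:
    """Return True if the two closed intervals share at least one point in time."""
    return a.early <= b.late and b.early <= a.late


def overlapping_aggressors(victim: Window, aggressors: dict[str, Window]) -> list[str]:
    """Keep only the aggressors whose timing window intersects the victim's window."""
    return [name for name, window in aggressors.items() if overlaps(victim, window)]


# Hypothetical windows for a victim and four aggressors A1-A4.
victim = Window(early=1.0, late=2.0)
aggressors = {
    "A1": Window(0.5, 1.2),
    "A2": Window(1.5, 1.8),
    "A3": Window(2.5, 3.0),  # switches only after the victim has settled
    "A4": Window(1.9, 2.6),
}

print(overlapping_aggressors(victim, aggressors))
```

With these made-up values, A3 is filtered out because its window never intersects the victim's. A window-based check of this kind is static: it reasons about when nets could switch, not about whether they actually switch together under a given input vector.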
However, crosstalk analysis is challenging and slow due to the complex nature of delay computation: one must account for considerations such as the arrival times of the victim transitions relative to the aggressor, switching states, variation-dependent arrival times, coupling parasitics, the influence of multiple aggressors, and the slews of the aggressor and victim nets. The most accurate method of simulating crosstalk effects is to conduct dynamic SPICE simulations, but SPICE simulations on even a few nets are extremely slow. For example, analyzing a single victim net with four aggressors using SPICE can take several hours with 32 ($2^5$) input combinations and a Monte Carlo-based approach to model variation. Today’s ICs have millions of victim nets, each coupled with hundreds of aggressor nets, so it is impractical to scale this approach to the entire design. Instead, design flows rely on signal-integrity-aware static timing analysis (STA) tools [1, 2] to perform timing signoff, perhaps followed by SPICE on a few critical STA-identified nets. However, there are crucial challenges in such a flow. The first is that STA-based methods used in practice to achieve timing signoff are fundamentally static in nature and impose limits on accuracy [3, 6, 7]. The second is that in deep submicron technologies, SPICE netlist simulation of even a few critical STA-identified nets remains computationally expensive because