XT-PRAGGMA: Crosstalk Pessimism Reduction Achieved with
GPU Gate-level Simulations and Machine Learning
Vidya A. Chhabria
University of Minnesota
Minneapolis, MN, USA
Ben Keller
Yanqing Zhang
Sandeep Vollala
Sreedhar Pratty
NVIDIA Corporation
Santa Clara, CA, USA
Haoxing Ren
Brucek Khailany
NVIDIA Corporation
Austin, TX, USA
Abstract
Accurate crosstalk-aware timing analysis is critical in nanometer-scale
process nodes. While today’s VLSI flows rely on static timing
analysis (STA) techniques to perform crosstalk-aware timing signoff,
these techniques are limited due to their static nature as they
use imprecise heuristics such as arbitrary aggressor filtering and
simplified delay calculations. This paper proposes XT-PRAGGMA,
a tool that uses GPU-accelerated dynamic gate-level simulations
and machine learning to eliminate false aggressors and accurately
predict crosstalk-induced delta delays. XT-PRAGGMA reduces STA
pessimism and provides crucial information to identify crosstalk-
critical nets, which can be considered for accurate SPICE simulation
before signoff. The proposed technique is fast (less than two hours
to simulate 30,000 vectors on million-gate designs) and reduces
falsely-reported total negative slack in timing signoff by 70%.
CCS Concepts
• Hardware → Physical design (EDA); Methodologies for EDA;
Static timing analysis; Timing analysis and sign-off; Simulation
and emulation; Transition-based timing analysis.
Keywords
Crosstalk analysis, machine learning (ML), and GPU-accelerated
gate-level simulations.
ACM Reference Format:
Vidya A. Chhabria, Ben Keller, Yanqing Zhang, Sandeep Vollala, Sreedhar
Pratty, Haoxing Ren, and Brucek Khailany. 2022. XT-PRAGGMA: Crosstalk
Pessimism Reduction Achieved with GPU Gate-level Simulations and Ma-
chine Learning. In Proceedings of the 2022 ACM/IEEE Workshop on Machine
Learning for CAD (MLCAD ’22), September 12–13, 2022, Snowbird, UT, USA.
ACM, New York, NY, USA, 7 pages. https://doi.org/10.1145/3551901.3556483
1 Introduction
Analyzing crosstalk is crucial for timing signoff in modern digital
integrated circuits (ICs) with very small on-chip wire geometries and
MLCAD ’22, September 12–13, 2022, Snowbird, UT, USA
© 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-9486-4/22/09. . . $15.00
https://doi.org/10.1145/3551901.3556483
[Figure 1 image omitted. Labels: Victim net and aggressors A1–A4; Delta delay; timing windows for the Victim and A1–A4 along axis t; panels (a) and (b).]
Figure 1: (a) Physically adjacent nets with shared coupling
capacitances and a corresponding slowdown in transition
(delta delay). (b) Crosstalk-aware STA operates by finding
overlapping timing windows between aggressor and victim.
large transistor densities. Parasitic cross-coupling between physi-
cally adjacent nets can lead to transition slowdowns or speedups
(known as delta delays) or voltage glitches (known as bumps). These
effects depend on the switching state and the arrival times of the
signal on a net being analyzed (the victim) and one or more coupled
nets (aggressors). Fig. 1(a) shows an example of crosstalk in which
the victim net is coupled with four aggressors A1–A4 which induce
a slowdown on the victim’s transition. It is essential to estimate the
delta delay to accurately close timing in today’s highly congested
ICs. However, crosstalk analysis is challenging and slow due to
the complex nature of delay computation: one must account for
considerations such as the arrival times of the victim transitions rel-
ative to the aggressor, switching states, variation-dependent arrival
times, coupling parasitics, the influence of multiple aggressors, and
the slews of the aggressors and victim nets.
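The timing-window check that Fig. 1(b) attributes to crosstalk-aware STA can be illustrated with a minimal sketch. This is not the implementation of any STA tool or of XT-PRAGGMA: the `Window` class, the `overlapping_aggressors` helper, and all window values are hypothetical, and the sketch ignores switching direction, slews, coupling parasitics, and variation, all of which a real analysis must consider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Earliest and latest possible switching time of a net (arbitrary units)."""
    early: float
    late: float

    def overlaps(self, other: "Window") -> bool:
        # Two closed intervals overlap when each one starts
        # no later than the other one ends.
        return self.early <= other.late and other.early <= self.late


def overlapping_aggressors(victim, aggressors):
    """Return the names of aggressors whose window overlaps the victim's."""
    return [name for name, window in aggressors.items() if window.overlaps(victim)]


# Hypothetical windows, chosen only for illustration.
victim = Window(2.0, 4.0)
aggressors = {
    "A1": Window(0.0, 1.5),  # finishes before the victim can switch
    "A2": Window(1.0, 2.5),
    "A3": Window(3.5, 5.0),
    "A4": Window(4.5, 6.0),  # starts after the victim has switched
}

print(overlapping_aggressors(victim, aggressors))  # ['A2', 'A3']
```

Under these assumed windows, only A2 and A3 can switch while the victim is switching. Note that a window overlap only means simultaneous switching is possible; it does not show that the aggressor and victim ever switch together under a real input vector.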
The most accurate method of simulating crosstalk effects is to
conduct dynamic SPICE simulations, but SPICE simulations on
even a few nets are extremely slow. For example, analyzing a single
victim net with four aggressors using SPICE can take several hours
with 32 (2^5) input combinations and a Monte Carlo-based approach
to model variation. Today’s ICs have millions of victim nets, each
coupled with hundreds of aggressor nets, so it is impractical to
scale this approach to the entire design. Instead, design flows rely
on signal-integrity-aware static timing analysis (STA) tools [1, 2]
to perform timing signoff, perhaps followed by SPICE on a few
critical STA-identified nets. However, there are crucial challenges
in such a flow. The first is that STA-based methods used in practice
to achieve timing signoff are fundamentally static in nature and
impose limits on accuracy [3, 6, 7]. The second is that in deep
sub-micron technologies, SPICE netlist simulation of even a few critical
STA-identified nets remains computationally expensive because