Journal of Breast Imaging, 2022, 488–495
doi:10.1093/jbi/wbac046
Original Research
Received: December 5, 2021; Editorial Acceptance: June 5, 2022
Published Online: September 7, 2022
© Society of Breast Imaging 2022. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
Multicenter, Multivendor Validation of an FDA-approved Algorithm for Mammography Triage
Tara A. Retson, MD, PhD,¹,* Alyssa T. Watanabe, MD,²,³ Hoanh Vu, PhD,³ Chi Yung Chim, PhD³
¹University of California School of Medicine, Department of Radiology, La Jolla, CA, USA;
²University of Southern California Keck School of Medicine, Department of Radiology, Los Angeles, CA, USA;
³CureMetrix, Inc., La Jolla, CA, USA
*Address correspondence to T.A.R. (e-mail: tretson@ucsd.edu)
Abstract
Objective: Artificial intelligence (AI)–based triage algorithms may improve cancer detection and
expedite radiologist workflow. To this end, the performance of a commercial AI-based triage
algorithm on screening mammograms was evaluated across breast densities and lesion types.
Methods: This retrospective, IRB-exempt, multicenter, multivendor study examined 1255 screening
4-view mammograms (400 positive and 855 negative studies). Images were anonymized by the
providing institutions and analyzed by a commercially available AI algorithm (cmTriage, CureMetrix,
La Jolla, CA) that performed retrospective triage at the study level by flagging exams as
“suspicious” or not. Sensitivities and specificities with confidence intervals were derived from area
under the curve (AUC) calculations.
Results: The algorithm demonstrated an AUC of 0.95 (95% CI: 0.94–0.96) for case identification.
The AUC was consistent across breast densities (0.95) and lesion types (masses: 0.94 [95% CI:
0.92–0.96]; microcalcifications: 0.97 [95% CI: 0.96–0.99]). The algorithm has a default sensitivity
of 93% (95% CI: 90.5%–95.6%) with a specificity of 76.3% (95% CI: 73.4%–79.2%). To evaluate
real-world performance, the algorithm was also tested at a sensitivity of 86.9% (95% CI:
83.6%–90.2%), matching that observed for practicing radiologists in the Breast Cancer Surveillance
Consortium (BCSC) study. The resulting specificity was 88.5% (95% CI: 86.4%–90.7%), similar to the
BCSC specificity of 88.9%, indicating performance comparable to real-world results.
Conclusion: When tested for lesion detection, an AI-based triage software can perform at the
level of practicing radiologists. Drawing attention to suspicious exams may improve reader
specificity and help streamline radiologist workflow, enabling faster turnaround times and
improving care.
Key words: deep learning; screening mammography triage; workflow improvement.
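As a rough check on the confidence intervals quoted in the Results, the sketch below applies a normal-approximation (Wald) interval to the reported sensitivities and specificities, using the 400 positive and 855 negative studies stated in the Methods. The choice of the Wald formula, the function name `wald_ci`, and the `reported` dictionary are assumptions made for illustration; the abstract does not state which interval method the authors used. Under these assumptions the computed bounds fall close to the published ones.

```python
import math


def wald_ci(proportion, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Hypothetical helper for illustration; not taken from the study.
    """
    half_width = z * math.sqrt(proportion * (1.0 - proportion) / n)
    return max(0.0, proportion - half_width), min(1.0, proportion + half_width)


# Study counts as stated in the Methods.
N_POSITIVE = 400  # positive studies (denominator for sensitivity)
N_NEGATIVE = 855  # negative studies (denominator for specificity)

# Point estimates as reported in the Results.
reported = {
    "default sensitivity": (0.930, N_POSITIVE),
    "default specificity": (0.763, N_NEGATIVE),
    "BCSC-matched sensitivity": (0.869, N_POSITIVE),
    "resulting specificity": (0.885, N_NEGATIVE),
}

for label, (p, n) in reported.items():
    low, high = wald_ci(p, n)
    print(f"{label}: {100 * p:.1f}% (95% CI: {100 * low:.1f}%-{100 * high:.1f}%)")
```

Sensitivity intervals use the number of positive studies as the denominator and specificity intervals use the number of negative studies, which is why the sensitivity intervals are wider despite similar point estimates.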
Introduction
Screening mammography saves lives through early detection
of breast cancer (1,2). When read by two independent
radiologists, the rate of cancer detection increases and patient
recalls decrease (3–5). Despite this clear benefit,
double reading of exams is often impractical: 39 million
mammograms are performed each year in the United States
alone, and there is an ongoing shortage of radiologists in
Europe and a projected shortage for the United States (6,7).
Computer-aided detection (CAD) software for mammography
was developed to improve reader performance and
to serve as a second reader. Following initial Food and Drug
Administration (FDA) approval for mammography CAD in
1998, CAD was approved by the Centers for Medicare and