Journal of Breast Imaging, 2022, 488–495
doi:10.1093/jbi/wbac046
Original Research
Received: December 5, 2021; Editorial Acceptance: June 5, 2022
Published Online: September 7, 2022
© Society of Breast Imaging 2022. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

Multicenter, Multivendor Validation of an FDA-approved Algorithm for Mammography Triage

Tara A. Retson, MD, PhD,1,* Alyssa T. Watanabe, MD,2,3 Hoanh Vu, PhD,3 Chi Yung Chim, PhD3

1 University of California School of Medicine, Department of Radiology, La Jolla, CA, USA; 2 University of Southern California Keck School of Medicine, Department of Radiology, Los Angeles, CA, USA; 3 CureMetrix, Inc., La Jolla, CA, USA
*Address correspondence to T.A.R. (e-mail: tretson@ucsd.edu)

Abstract

Objective: Artificial intelligence (AI)–based triage algorithms may improve cancer detection and expedite radiologist workflow. To this end, the performance of a commercial AI-based triage algorithm on screening mammograms was evaluated across breast densities and lesion types.

Methods: This retrospective, IRB-exempt, multicenter, multivendor study examined 1255 screening 4-view mammograms (400 positive and 855 negative studies). Images were anonymized by the providing institutions and analyzed by a commercially available AI algorithm (cmTriage, CureMetrix, La Jolla, CA) that performed retrospective triage at the study level by flagging exams as "suspicious" or not. Sensitivities and specificities with confidence intervals were derived from area under the curve (AUC) calculations.

Results: The algorithm demonstrated an AUC of 0.95 (95% CI: 0.94–0.96) for case identification. The AUC held across breast densities (0.95) and lesion types (masses: 0.94 [95% CI: 0.92–0.96]; microcalcifications: 0.97 [95% CI: 0.96–0.99]). The algorithm has a default sensitivity of 93% (95% CI: 90.5%–95.6%) with a specificity of 76.3% (95% CI: 73.4%–79.2%).
To evaluate real-world performance, the algorithm was also tested at a sensitivity of 86.9% (95% CI: 83.6%–90.2%), the value observed for practicing radiologists in the Breast Cancer Surveillance Consortium (BCSC) study. The resulting specificity was 88.5% (95% CI: 86.4%–90.7%), similar to the BCSC specificity of 88.9%, indicating performance comparable to real-world results.

Conclusion: When tested for lesion detection, an AI-based triage software can perform at the level of practicing radiologists. Drawing attention to suspicious exams may improve reader specificity and help streamline radiologist workflow, enabling faster turnaround times and improving care.

Key words: deep learning; screening mammography triage; workflow improvement.

Introduction

Screening mammography saves lives through early detection of breast cancer (1,2). When exams are read by two independent radiologists, the rate of cancer detection increases and patient recalls decrease (3–5). Despite this obvious benefit, double reading of exams is often impractical: 39 million mammograms are performed each year in the United States alone, and there is an ongoing shortage of radiologists in Europe and a projected shortage in the United States (6,7).

Computer-aided detection (CAD) software for mammography was developed as a way to improve performance and serve as a second reader. Following initial Food and Drug Administration (FDA) approval for mammography CAD in 1998, CAD was approved by the Centers for Medicare and