Journal of Breast Imaging, 2022, 488–495
doi:10.1093/jbi/wbac046
Original Research
Received: December 5, 2021; Editorial Acceptance: June 5, 2022; Published Online: September 7, 2022
© Society of Breast Imaging 2022. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

Multicenter, Multivendor Validation of an FDA-approved Algorithm for Mammography Triage

Tara A. Retson, MD, PhD,1,* Alyssa T. Watanabe, MD,2,3 Hoanh Vu, PhD,3 Chi Yung Chim, PhD3

1 University of California School of Medicine, Department of Radiology, La Jolla, CA, USA
2 University of Southern California Keck School of Medicine, Department of Radiology, Los Angeles, CA, USA
3 CureMetrix, Inc., La Jolla, CA, USA
*Address correspondence to T.A.R. (e-mail: tretson@ucsd.edu)

Abstract

Objective: Artificial intelligence (AI)–based triage algorithms may improve cancer detection and expedite radiologist workflow. To this end, the performance of a commercial AI-based triage algorithm on screening mammograms was evaluated across breast densities and lesion types.

Methods: This retrospective, IRB-exempt, multicenter, multivendor study examined 1255 screening 4-view mammograms (400 positive and 855 negative studies). Images were anonymized by providing institutions and analyzed by a commercially available AI algorithm (cmTriage, CureMetrix, La Jolla, CA) that performed retrospective triage at the study level by flagging exams as “suspicious” or not. Sensitivities and specificities with confidence intervals were derived from area under the curve (AUC) calculations.

Results: The algorithm demonstrated an AUC of 0.95 (95% CI: 0.94–0.96) for case identification. Area under the curve held across densities (0.95) and lesion types (masses: 0.94 [95% CI: 0.92–0.96] or microcalcifications: 0.97 [95% CI: 0.96–0.99]). The algorithm has a default sensitivity of 93% (95% CI: 90.5%–95.6%) with specificity of 76.3% (95% CI: 73.4%–79.2%).
To evaluate real-world performance, the algorithm was also tested at a sensitivity of 86.9% (95% CI: 83.6%–90.2%), the sensitivity observed for practicing radiologists in the Breast Cancer Surveillance Consortium (BCSC) study. The resulting specificity was 88.5% (95% CI: 86.4%–90.7%), similar to the BCSC specificity of 88.9%, indicating performance comparable to real-world results.

Conclusion: When tested for lesion detection, an AI-based triage software can perform at the level of practicing radiologists. Drawing attention to suspicious exams may improve reader specificity and help streamline radiologist workflow, enabling faster turnaround times and improving care.

Key words: deep learning; screening mammography triage; workflow improvement.

Introduction

Screening mammography saves lives through early detection of breast cancer (1,2). When exams are read by two independent radiologists, the cancer detection rate increases and patient recalls decrease (3–5). Despite this clear benefit, double reading is often impractical: 39 million mammograms are performed each year in the United States alone, and there is an ongoing shortage of radiologists in Europe and a projected shortage in the United States (6,7).

Computer-aided detection (CAD) software for mammography was developed as a way to improve performance and to serve as a second reader. Following initial Food and Drug Administration (FDA) approval for mammography CAD in 1998, CAD was approved by the Centers for Medicare and