Purpose: Recent studies have shown that AI helps radiologists find more cancers in practice. What has not been shown in practice at scale is whether AI also helps reduce the interval cancer rate (ICR). Interval cancers are a subset of dangerous cancers that are detected only once symptomatic. Using real-world data from over half a million patients, we assessed whether the deployment of a categorical CADe/x AI helped radiologists reduce ICR in practice.
Materials and Methods: We studied ICR at 124 outpatient radiology sites performing mammograms both before (15 months, 532,250 screening exams) and after (3 months, 156,013 screening exams) the deployment of a categorical AI. A total of 169 MQSA-qualified radiologists read exams in both periods. Biopsy data were collected during the 12 months following each exam. Although AI categories were not available to radiologists pre-deployment, the AI was retrospectively evaluated on these exams. The AI outputs one of four suspicion categories (Minimal, Low, Intermediate, or High) reflecting how likely it is that a suspicious finding is present; from lowest to highest suspicion, these represent approximately 25%, 50%, 20%, and 5% of a screening population, respectively. Cancer detection rate (CDR), recall rate, and ICR were measured in both periods for each suspicion category. An interval cancer was defined as a biopsy-confirmed malignant pathology within 12 months after a negatively interpreted screening exam. Statistical significance was calculated using a generalized linear model (GLM)-based approach.
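The interval cancer definition above can be illustrated with a minimal sketch. It is not the study's actual pipeline: the function and parameter names are hypothetical, and "within 12 months" is approximated here as 365 days, which is an assumption.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: the 12-month follow-up window is approximated as 365 days.
FOLLOW_UP = timedelta(days=365)


def is_interval_cancer(
    exam_date: date,
    screen_negative: bool,
    malignant_biopsy_date: Optional[date],
) -> bool:
    """Flag an exam as an interval cancer under the stated definition.

    An exam counts only if it was interpreted as negative and a
    biopsy-confirmed malignancy occurred within the follow-up window.
    """
    if not screen_negative or malignant_biopsy_date is None:
        return False
    return exam_date < malignant_biopsy_date <= exam_date + FOLLOW_UP


def rate_per_1000(events: int, exams: int) -> float:
    """Express an event count as a rate per 1000 screening exams."""
    return 1000.0 * events / exams
```

Under these assumptions, a negative exam on 2021-01-10 followed by a malignant biopsy on 2021-09-01 is flagged, whereas the same biopsy after a positive (recalled) exam is not.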
Results: Before deployment, 163 of 532,250 patients had an interval cancer; after deployment, 44 of 156,013 patients had an interval cancer. The ICR decreased by 10%, from 0.31 to 0.28 per 1000 exams (p = 0.32), after deployment. This change was driven primarily by a decrease in interval cancers in the AI’s High suspicion category (from 1.5 to 1.23 per 1000 exams). The CDR increased by 14%, from 5.1 to 5.8 per 1000 exams (p < 0.05), and the recall rate increased by 6%, from 12.4 to 13.1 per 100 exams (p < 0.001).
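The reported rates and relative changes can be checked with simple arithmetic. The sketch below uses only the counts and rates given in the Results; the helper names are illustrative. It does not attempt to reproduce the GLM-based p-values. One thing it makes visible: the 10% decrease corresponds to the rounded rates (0.31 and 0.28), while the raw counts give a relative decrease closer to 8%.

```python
def rate_per_1000(events: int, exams: int) -> float:
    """Event count expressed as a rate per 1000 screening exams."""
    return 1000.0 * events / exams


def relative_change(before: float, after: float) -> float:
    """Fractional change from `before` to `after` (negative = decrease)."""
    return (after - before) / before


# Interval cancer counts and exam totals from the Results.
icr_pre = rate_per_1000(163, 532_250)   # about 0.306 per 1000 exams
icr_post = rate_per_1000(44, 156_013)   # about 0.282 per 1000 exams

# Relative change from the rounded, reported rates vs. the raw counts.
icr_change_rounded = relative_change(0.31, 0.28)
icr_change_raw = relative_change(icr_pre, icr_post)

# Relative changes for the other reported metrics.
cdr_change = relative_change(5.1, 5.8)            # CDR
recall_change = relative_change(12.4, 13.1)       # recall rate, per 100 exams
high_category_change = relative_change(1.5, 1.23)  # ICR, High category

for name, value in [
    ("ICR (rounded rates)", icr_change_rounded),
    ("ICR (raw counts)", icr_change_raw),
    ("CDR", cdr_change),
    ("Recall rate", recall_change),
    ("ICR, High category", high_category_change),
]:
    print(f"{name}: {100 * value:+.1f}%")
```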
Conclusion: CDR increased by approximately the same percentage as ICR decreased. Although the decrease in ICR was not statistically significant, the reduction was most evident in exams assigned to the AI’s High suspicion category, suggesting that radiologists benefited most from the AI in this category.
Clinical Relevance Statement: A large-scale deployment of a categorical AI helped radiologists to detect more cancers earlier and miss fewer cancers.