Purpose: A deep learning (DL) algorithm was previously designed to predict a patient’s risk of developing breast cancer using mammographic imaging biomarkers alone. The purpose of this study was to compare the diagnostic accuracy of the DL image-only model with that of traditional risk assessment models in prospectively predicting ductal carcinoma in situ (DCIS) and invasive breast cancer in a cohort due for screening mammography.
Materials and Methods: This multisite study included consecutive patients older than 30 years who underwent routine bilateral screening mammography from 9/18/2017 to 9/17/2021 at five facilities and had at least one year of follow-up. Risk was assessed with the DL 5-year model and with the Tyrer-Cuzick version 8 (TC8) 5-year and lifetime models. Women with a personal history of breast cancer and those without valid risk scores were excluded. Patient demographics were retrieved from electronic medical records. Cancer outcomes were obtained through linkage to a regional tumor registry. DL and TC8 model performance was compared using the area under the receiver operating characteristic curve (AUC) with the DeLong test; P < 0.05 was considered statistically significant.
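The abstract does not describe how the AUC comparison was implemented. As an illustration only, the following is a minimal sketch of a two-sided DeLong test for two AUCs measured on the same examinations, written in plain Python. The function names (`placements`, `sample_cov`, `delong_test`) and the toy labels and scores are hypothetical and are not study data. The sketch also treats every examination as independent, which is an assumption; it does not account for patients who contributed more than one mammogram.

```python
import math


def placements(pos, neg):
    """Return per-case placement values (V10, V01) and the AUC for one model."""
    def psi(x, y):
        # 1 if the cancer case outscores the non-cancer case, 0.5 on ties.
        return 1.0 if x > y else 0.5 if x == y else 0.0

    v10 = [sum(psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01, sum(v10) / len(v10)


def sample_cov(a, mean_a, b, mean_b):
    """Unbiased sample covariance of two equal-length lists."""
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(labels, scores_a, scores_b):
    """Two-sided DeLong test for two AUCs measured on the same cases.

    labels: 1 = cancer, 0 = no cancer. Returns (auc_a, auc_b, z, p).
    """
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    m, n = len(pos_idx), len(neg_idx)
    if m < 2 or n < 2:
        raise ValueError("need at least two positive and two negative cases")

    def split(scores):
        return [scores[i] for i in pos_idx], [scores[i] for i in neg_idx]

    a = placements(*split(scores_a))  # (V10, V01, AUC) for model A
    b = placements(*split(scores_b))  # (V10, V01, AUC) for model B

    def cov_term(r, s):
        # One entry of the DeLong covariance matrix: S10 / m + S01 / n.
        return (sample_cov(r[0], r[2], s[0], s[2]) / m
                + sample_cov(r[1], r[2], s[1], s[2]) / n)

    auc_a, auc_b = a[2], b[2]
    var_diff = cov_term(a, a) + cov_term(b, b) - 2.0 * cov_term(a, b)
    if var_diff <= 0.0:
        # Degenerate variance: only interpretable when the AUCs coincide.
        same = auc_a == auc_b
        return auc_a, auc_b, (0.0 if same else math.nan), (1.0 if same else math.nan)

    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return auc_a, auc_b, z, p


# Synthetic toy data (hypothetical, NOT study data).
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_model_a = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
toy_model_b = [0.6, 0.3, 0.5, 0.7, 0.4, 0.2, 0.1]

auc_a, auc_b, z, p = delong_test(toy_labels, toy_model_a, toy_model_b)
print(f"AUC A = {auc_a:.3f}, AUC B = {auc_b:.3f}, z = {z:.2f}, P = {p:.3f}")
```

Because both models are scored on the same cases, the test subtracts the covariance between the two AUC estimates rather than treating them as independent. In practice an established implementation, such as the one in a statistics package, would usually be preferable to hand-written code.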
Results: A total of 130376 bilateral screening mammograms in 56777 patients met inclusion criteria (mean age 59 years, IQR: 51-68 years). Of these examinations, 105146/130376 (80.6%) were in post-menopausal and 25230/130376 (19.4%) in pre-menopausal patients; 78387/130096 (60.3%) were in patients with non-dense breasts and 51709/130096 (39.7%) in patients with dense breasts. By race and ethnicity, 105907/130376 (81.2%) examinations were in White, 7245/130376 (5.6%) in Asian, 6515/130376 (5.0%) in Black, 2242/130376 (1.7%) in Hispanic and 5909/130376 (4.5%) in Other patients. The AUC of the DL model in predicting DCIS or invasive malignancy was 0.67 (95% confidence interval [CI]: 0.65, 0.70) vs 0.57 (95% CI: 0.54, 0.60; P < 0.001) for the TC8 5-year model and 0.50 (95% CI: 0.47, 0.53; P < 0.001) for the TC8 lifetime model. The AUC of the DL model in predicting DCIS was 0.70 (95% CI: 0.65, 0.74) vs 0.57 (95% CI: 0.53, 0.62; P < 0.001) for the TC8 5-year model and 0.50 (95% CI: 0.45, 0.55; P < 0.001) for the TC8 lifetime model. The AUC of the DL model in predicting invasive malignancy was 0.66 (95% CI: 0.64, 0.69) vs 0.57 (95% CI: 0.53, 0.60; P < 0.001) for the TC8 5-year model and 0.50 (95% CI: 0.47, 0.54; P < 0.001) for the TC8 lifetime model.
Conclusion: Mammograms contain highly predictive biomarkers of cancer risk that are not identified by traditional risk models. A DL model using screening mammography alone can improve risk discrimination for both DCIS and invasive disease compared with traditional risk models.
Clinical Relevance Statement: Traditional risk models rely on patient data that can be time-consuming to acquire, inconsistent, or missing. A DL image-only risk model can increase access to more accurate, less costly risk assessment for predicting both DCIS and invasive malignancy.