Purpose: To evaluate radiologists’ performance in breast imaging interpretation when using an AI-based tool designed to assist not only with the detection of suspicious lesions but with image interpretation in the broadest sense.
Materials and Methods: The evaluated AI system can sort the worklist and assess breast density and image quality (non-interpretative tasks), and can compare current images with prior ones, detect and characterize findings, and support the final decision (interpretative tasks). The tool was evaluated in a multi-reader multi-case study in which 25 breast radiologists interpreted 240 combined DBT and 2D examinations (FFDM or 2DSM) with and without the help of AI. Interpretation included BI-RADS assignment, level-of-suspicion assessment, position and description (type and location) of the most informative finding, and breast density category assignment. Performance was measured in terms of AUC, sensitivity, specificity and reading time, and analyzed with statistical methods for reader studies. The standalone performance of the AI tool was also assessed separately for interpretative and non-interpretative tasks using confusion matrices and inter-rater agreement analysis. The study was conducted in accordance with the Health Insurance Portability and Accountability Act and approved by an institutional review board.
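As an illustration of the reader-level metrics named above, the following is a minimal sketch of how AUC, sensitivity and specificity could be computed per reader and averaged across readers. It is not the study's analysis: the function names, the 0-100 level-of-suspicion scale, the recall threshold of 50 and the toy data are all assumptions made for the example, and the variance modeling across readers and cases that a multi-reader multi-case analysis requires is not reproduced here.

```python
from statistics import mean


def auc(scores, labels):
    """Empirical AUC: fraction of (cancer, non-cancer) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg


def reader_average(readings, labels, threshold):
    """Average AUC, sensitivity and specificity across readers."""
    per_reader = [
        (auc(s, labels), *sens_spec(s, labels, threshold))
        for s in readings.values()
    ]
    return tuple(mean(col) for col in zip(*per_reader))


# Hypothetical toy data: 6 cases (1 = cancer), two readers,
# level-of-suspicion scores on an assumed 0-100 scale.
labels = [1, 1, 0, 0, 1, 0]
readings = {
    "reader_1": [80, 60, 30, 50, 90, 10],
    "reader_2": [70, 40, 20, 60, 85, 30],
}

avg_auc, avg_sens, avg_spec = reader_average(readings, labels, threshold=50)
print(f"AUC={avg_auc:.3f} sensitivity={avg_sens:.3f} specificity={avg_spec:.3f}")
```

Running the same computation on the unaided and the AI-aided reading sessions would give the paired reader averages whose difference is then tested; the confidence intervals reported in a reader study come from a dedicated statistical model rather than from this simple averaging.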
Results: The average AUC across readers improved significantly when using the AI system. Sensitivity and specificity also improved, by 7% (95% CI: 5% to 9.5%) and 5% (95% CI: 1.5% to 8.7%), respectively. Median reading time was 83 seconds without AI and 65 seconds with AI (average difference: -15; 95% CI: -30 to -4.7). The reduction varied with the use of the proposed triage system: the most pronounced effect was observed for cases assigned very low or very high AI scores and breast density A. Cases with very low scores are of very low suspicion, so the AI helps readers assess them faster; cases with very high scores require the description of a suspicious finding, so the time reduction comes from the time saved in reporting it.
Conclusion: The concurrent use of a single AI tool covering multiple tasks of the image interpretation process could improve radiologists’ diagnostic performance in breast cancer detection and substantially improve the efficiency of their workflow.
Clinical Relevance Statement: This AI system can be integrated into clinical practice to improve both the efficiency and the accuracy of screening programs.