This is an overview page with metadata for this scientific publication. The full article is available from the publisher.
An Exploration of Discrepant Recalls Between AI and Human Readers of Malignant Lesions in Digital Mammography Screening
Citations: 3
Authors: 10
Year: 2025
Abstract
<b>Background:</b> The integration of artificial intelligence (AI) in digital mammography (DM) screening holds promise for early breast cancer detection, potentially enhancing accuracy and efficiency. However, AI performance is not identical to that of human observers. We aimed to identify common morphological image characteristics of true cancers that are missed by either AI or human screening when their interpretations are discrepant. <b>Methods:</b> Twenty-six breast cancer-positive cases, identified from a large retrospective multi-institutional digital mammography dataset based on discrepant AI and human interpretations, were included in a reader study. Ground truth was confirmed by histopathology or ≥1-year follow-up. Fourteen radiologists assessed lesion visibility, morphological features, and likelihood of malignancy. AI performance was evaluated using receiver operating characteristic (ROC) analysis and area under the curve (AUC). The reader study results were analyzed using interobserver agreement measures and descriptive statistics. <b>Results:</b> AI demonstrated high discriminative capability in the full dataset, with AUCs ranging from 0.903 (95% CI: 0.862-0.944) to 0.946 (95% CI: 0.896-0.996). Cancers missed by AI had a significantly smaller median size (9.0 mm, IQR 6.5-12.0) than those missed by human readers (21.0 mm, IQR 10.5-41.0) (<i>p</i> = 0.0014). Cancers in discrepant cases were often described as having 'low visibility', 'indistinct margins', or 'irregular shape'. Calcifications were more frequent among human-missed cancers (42/154; 27%) than among AI-missed cancers (38/210; 18%) (<i>p</i> = 0.396). Furthermore, 50/154 (32.5%) human-missed cancers were deemed to have a very high likelihood of malignancy, compared to 41/210 (19.5%) AI-missed cancers (<i>p</i> = 0.8). Overall inter-rater agreement on the items assessed during the reader study was poor to fair (<0.40), suggesting that interpretation of the selected images was challenging. <b>Conclusions:</b> Lesions missed by AI were smaller and less often calcified than those missed by human readers, and tended to show lower levels of suspicion. While definitive conclusions are premature, the findings highlight the complementary roles of AI and human readers in mammographic interpretation.
Related Works
A survey on deep learning in medical image analysis
2017 · 13,500 citations
Dermatologist-level classification of skin cancer with deep neural networks
2017 · 13,129 citations
A survey on Image Data Augmentation for Deep Learning
2019 · 11,731 citations
QuPath: Open source software for digital pathology image analysis
2017 · 8,101 citations
Radiomics: Images Are More than Pictures, They Are Data
2015 · 7,981 citations
Authors
Institutions
- Vrije Universiteit Amsterdam (NL)
- Radboud University Nijmegen (NL)
- Amsterdam UMC Location VUmc (NL)
- Radboud University Medical Center (NL)
- Dutch Expert Centre for Screening (NL)
- University of Twente (NL)
- Leiden University Medical Center (NL)
- Istituti di Ricovero e Cura a Carattere Scientifico (IT)
- Istituto Oncologico Veneto (IT)
- Universidad Complutense de Madrid (ES)
- Vienna General Hospital (AT)
- Medical University of Vienna (AT)
- The Netherlands Cancer Institute (NL)
- Addenbrooke's Hospital (GB)