This is an overview page with metadata for this scientific publication. The full article is available from the publisher.
An Exploration of Discrepant Recalls Between AI and Human Readers of Malignant Lesions in Digital Mammography Screening
Citations: 3
Authors: 10
Year: 2025
Abstract
<b>Background:</b> The integration of artificial intelligence (AI) in digital mammography (DM) screening holds promise for early breast cancer detection, potentially enhancing accuracy and efficiency. However, AI performance is not identical to that of human observers. We aimed to identify common morphological image characteristics of true cancers that are missed by either AI or human screening when their interpretations are discrepant. <b>Methods:</b> Twenty-six breast cancer-positive cases, identified from a large retrospective multi-institutional digital mammography dataset based on discrepant AI and human interpretations, were included in a reader study. Ground truth was confirmed by histopathology or ≥1-year follow-up. Fourteen radiologists assessed lesion visibility, morphological features, and likelihood of malignancy. AI performance was evaluated using receiver operating characteristic (ROC) analysis and area under the curve (AUC). The reader study results were analyzed using interobserver agreement measures and descriptive statistics. <b>Results:</b> AI demonstrated high discriminative capability in the full dataset, with AUCs ranging from 0.903 (95% CI: 0.862-0.944) to 0.946 (95% CI: 0.896-0.996). Cancers missed by AI had a significantly smaller median size (9.0 mm, IQR 6.5-12.0) than those missed by human readers (21.0 mm, IQR 10.5-41.0) (<i>p</i> = 0.0014). Cancers in discrepant cases were often described as having 'low visibility', 'indistinct margins', or 'irregular shape'. Calcifications were more frequent among human-missed cancers (42/154; 27%) than among AI-missed cancers (38/210; 18%) (<i>p</i> = 0.396). Furthermore, 50/154 (32.5%) human-missed cancers were deemed to have a very high likelihood of malignancy, compared to 41/210 (19.5%) AI-missed cancers (<i>p</i> = 0.8). Overall inter-rater agreement on the items assessed during the reader study was poor to fair (<0.40), suggesting that interpretation of the selected images was challenging. <b>Conclusions:</b> Lesions missed by AI were smaller and less often calcified than those missed by human readers, and tended to show lower levels of suspicion. While definitive conclusions are premature, the findings highlight the complementary roles of AI and human readers in mammographic interpretation.
Related Works
A survey on deep learning in medical image analysis
2017 · 13,500 citations
Dermatologist-level classification of skin cancer with deep neural networks
2017 · 13,129 citations
A survey on Image Data Augmentation for Deep Learning
2019 · 11,731 citations
QuPath: Open source software for digital pathology image analysis
2017 · 8,101 citations
Radiomics: Images Are More than Pictures, They Are Data
2015 · 7,981 citations
Authors
Institutions
- Vrije Universiteit Amsterdam (NL)
- Radboud University Nijmegen (NL)
- Amsterdam UMC Location VUmc (NL)
- Radboud University Medical Center (NL)
- Dutch Expert Centre for Screening (NL)
- University of Twente (NL)
- Leiden University Medical Center (NL)
- Istituti di Ricovero e Cura a Carattere Scientifico (IT)
- Istituto Oncologico Veneto (IT)
- Universidad Complutense de Madrid (ES)
- Vienna General Hospital (AT)
- Medical University of Vienna (AT)
- The Netherlands Cancer Institute (NL)
- Addenbrooke's Hospital (GB)