This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

The Expertise Paradox: Who Benefits from LLM-Assisted Brain MRI Differential Diagnosis?

2025 · 0 citations · Open Access

Abstract

Purpose: To evaluate how reader experience influences the diagnostic benefit from LLM assistance in brain MRI differential diagnosis.

Materials and Methods: Neuroradiologists (n = 4), radiology residents (n = 4), and neurology/neurosurgery residents (n = 4) were recruited. A dataset of complex brain MRI cases was curated from the local imaging database (n = 40). For each case, readers provided a textual description of the main imaging finding and their top three differential diagnoses ("Unassisted"). Three state-of-the-art large language models (GPT-4.1, Gemini 2.5 Pro, DeepSeek-R1) were prompted to generate top-three differentials based on the clinical case description and reader-specific findings. Readers then revised their differential diagnoses after reviewing GPT-4.1 suggestions ("Assisted"). To evaluate the association between reader experience and diagnostic benefit, a cumulative link mixed model (CLMM) was fitted, with change in diagnostic result as ordinal outcome, reader experience as predictor, and random intercepts for rater and case.

Results: LLM-generated differential diagnoses achieved the highest top-3 accuracy when provided with image descriptions from neuroradiologists (top-3: 78.8-83.8%), followed by radiology residents (top-3: 71.8-77.6%), and neurology/neurosurgery residents (top-3: 62.6-64.5%). In contrast, mean relative gains in top-3 accuracy through LLM assistance diminished with increasing experience, with +19.2% for neurology/neurosurgery residents (from 43.2% to 62.6%), +14.7% for radiology residents (from 59.6% to 74.4%), and +4.4% for neuroradiologists (from 83.1% to 87.5%). The CLMM demonstrated a significant negative association between reader experience and diagnostic benefit from LLM assistance (β = −0.10, p = 0.005).

Conclusion: With increasing reader experience, absolute diagnostic LLM performance with reader-generated input improved, while relative diagnostic gains through LLM assistance paradoxically diminished. Our findings call attention to the divergence between standalone LLM performance and clinically relevant reader benefit, and emphasize the need to account for human-AI interaction in this context.
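
The gains quoted in the abstract can be checked against the before/after accuracies it reports. The sketch below is only an illustration, not the authors' analysis: the helper name `gains` and the dictionary layout are assumptions introduced here, and the inputs are the rounded group-level percentages from the abstract. It computes both the absolute difference in percentage points and the gain relative to the unassisted baseline.

```python
# Top-3 accuracies (%) before and after GPT-4.1 assistance, as quoted in the abstract.
reported = {
    "neurology/neurosurgery residents": (43.2, 62.6),
    "radiology residents": (59.6, 74.4),
    "neuroradiologists": (83.1, 87.5),
}


def gains(unassisted, assisted):
    """Return (absolute gain in percentage points, gain relative to baseline in %).

    Hypothetical helper for illustration; both values are rounded to one decimal.
    """
    absolute = assisted - unassisted
    relative = 100.0 * absolute / unassisted
    return round(absolute, 1), round(relative, 1)


for group, (before, after) in reported.items():
    abs_pp, rel_pct = gains(before, after)
    print(f"{group}: +{abs_pp} points absolute, +{rel_pct}% relative to baseline")
```

With these inputs the absolute differences come out at 19.4, 14.8, and 4.4 percentage points, which is close to the quoted +19.2%, +14.7%, and +4.4%. That suggests the quoted gains are percentage-point differences and not ratios to the baseline. The abstract does not state how the group means were computed, so the small remaining differences are left unexplained here.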

Topics

Radiology practices and education
Artificial Intelligence in Healthcare and Education
Radiomics and Machine Learning in Medical Imaging
