This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Evaluating the diagnostic accuracy of vision language models for neuroradiological image interpretation

2025 · 1 citation · npj Digital Medicine · Open Access

Abstract

This study evaluates the diagnostic performance of commercial and open-source Vision-Language Models (VLMs) in neuroradiological image interpretation, using a dataset of 100 brain and spine cases from Radiopaedia. Five VLMs (Gemini 2.0, OpenAI o1, Llama 3.2 90B, Qwen 2.5, Grok-2-vision) were compared to expert neuroradiologists in generating differential diagnoses based on brief clinical presentations and imaging. Neuroradiologists achieved a mean accuracy of 86.2%, whereas the best-performing VLM (Gemini 2.0) reached 35%. Scoring the top three differentials instead of only the first improved VLM accuracy marginally, but the models remained inferior to human experts. Clinical harm analysis revealed frequent diagnostic risks, primarily treatment delays, with harmful outputs in up to 45% of cases. Error analysis showed consistent failure modes, including incorrect anatomical localization, inaccurate imaging descriptions, and hallucinated findings. These results highlight the current limitations of VLMs and underscore the importance of expert oversight in neuroradiological diagnosis.

Topics

Artificial Intelligence in Healthcare and Education
Medical Imaging and Analysis
Machine Learning in Healthcare
