Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Diagnostic Accuracy and Stability of Multimodal Large Language Models for Hand Fracture Detection: A Multi-Run Evaluation on Plain Radiographs
0
Zitationen
5
Autoren
2026
Jahr
Abstract
: Diagnostic accuracy and consistency are distinct performance dimensions; high intra-model agreement does not imply correctness. While GPT-5 Pro demonstrated the most favorable balance of accuracy and stability, other models exhibited critical failure modes ranging from systematic bias to random instability. At present, MLLMs should be regarded as experimental diagnostic reasoning systems rather than reliable standalone tools for clinical fracture detection.
Ähnliche Arbeiten
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8.646 Zit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8.554 Zit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8.071 Zit.
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6.851 Zit.
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5.781 Zit.