This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Can advanced large language models support radiology training? A performance assessment of DeepSeek R1
Citations: 5
Authors: 5
Year: 2025
Abstract
Background: Large language models (LLMs) are increasingly used in medical education, including radiology training. DeepSeek-R1, an open-access LLM, has gained attention for its performance and accessibility. This study assesses DeepSeek-R1's ability to answer radiology-related questions based on the European Training Curriculum for Radiology.
Methods: Ninety questions were randomly selected from ten radiology subspecialties, covering Levels 1–3 of the curriculum. Three radiology residents (2nd-year, 4th-year, and subspecialty) rated the responses for correctness, clarity, and safety on a 5-point scale. In addition, 5 safety-related and 5 hallucination-based questions were included. Statistical analysis was conducted in RStudio, with the Kruskal-Wallis test used to assess differences across groups and a weighted kappa test used to assess inter- and intra-reader agreement.
Results: DeepSeek-R1 scored highly on correctness (4.1 ± 0.6), clarity (4.7 ± 0.6), and safety (4.8 ± 0.4). No significant differences were found across ESR levels or subspecialties. However, the 4th-year resident rated clarity significantly lower than the other residents did (p = 0.0031). The model produced no hallucinations or dangerous recommendations.
Conclusion: DeepSeek-R1 shows promise as a supplementary educational tool in radiology training, offering accurate and clear responses while minimizing the risk of misinformation. However, it remains essential to critically assess any answer from an LLM for inaccuracies.
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,292 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,143 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,539 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,452 citations