This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Boosting LLM-assisted diagnosis: 10-minute LLM tutorial elevates radiology residents’ performance in brain MRI interpretation
Citations: 1
Authors: 15
Year: 2025
Abstract
This study evaluated the impact of a structured tutorial on the use of a large language model (LLM)-based search engine on radiology residents’ performance in brain MRI differential diagnosis. Nine radiology residents determined the three most likely differential diagnoses for three sets of ten brain MRI cases, each with a challenging yet definite diagnosis. Each set was assessed under one of three conditions: (1) with the support of conventional internet search, (2) using an LLM-based search engine (© Perplexity AI) without prior tutorial, or (3) using the LLM-based search engine after a structured 10-minute tutorial. Reader responses were rated using a binary and a numeric scoring system. Reading times and confidence levels (measured on a 5-point Likert scale) were recorded for each case. Search engine logs were examined to quantify user interaction metrics and to identify hallucinations and misinterpretations in LLM responses. Radiology residents achieved the highest accuracy when using the LLM-based search engine after the tutorial, naming the correct diagnosis among the top three differential diagnoses in 62.5% of cases (55/88). This was followed by the LLM-assisted workflow before the tutorial (44.8%; 39/87) and the conventional internet search workflow (32.2%; 28/87). The LLM tutorial led to significantly higher performance (binary scores: p = 0.042, numeric scores: p = 0.016) and confidence (p = 0.006) but resulted in no relevant differences in reading times. Hallucinations were found in 5.1% of LLM queries. Our findings demonstrate the considerable benefits that even low-effort educational interventions on LLMs can provide, highlighting their potential role in radiology training programs.
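The top-three accuracy figures in the abstract can be recomputed from the reported counts. The following is a minimal sketch, not the authors' analysis code: the counts are taken from the abstract, while the names `reported`, `top3_accuracy`, and `accuracies` are hypothetical and introduced only for illustration.

```python
# Counts (correct cases, rated cases) as reported in the abstract.
# The dictionary keys are illustrative labels, not terms from the paper.
reported = {
    "LLM search after tutorial": (55, 88),
    "LLM search before tutorial": (39, 87),
    "Conventional internet search": (28, 87),
}


def top3_accuracy(correct: int, total: int) -> float:
    """Percentage of cases with the correct diagnosis among the top three."""
    return round(100 * correct / total, 1)


# Recompute the percentage for each workflow.
accuracies = {name: top3_accuracy(c, t) for name, (c, t) in reported.items()}

for name, pct in accuracies.items():
    print(f"{name}: {pct}%")
```

The recomputed values agree with the percentages stated in the abstract (62.5%, 44.8%, and 32.2%). The sketch does not reproduce the significance tests, since the abstract does not state which tests were used.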
Similar works
Refinement and reassessment of the SERVQUAL scale.
1991 · 3,967 citations
Radiobiology for the Radiologist.
1974 · 3,502 citations
ACR Thyroid Imaging, Reporting and Data System (TI-RADS): White Paper of the ACR TI-RADS Committee
2017 · 2,422 citations
Accuracy of Physician Self-assessment Compared With Observed Measures of Competence
2006 · 2,324 citations
Technology as an Occasion for Structuring: Evidence from Observations of CT Scanners and the Social Order of Radiology Departments
1986 · 2,247 citations