This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
SRS54 - How have generic large language models progressed in their ability to write clinical letters and manage patients in the virtual fracture clinic?
Citations: 0
Authors: 8
Year: 2026
Abstract
Objective: To evaluate the progression of large language models (LLMs) in their ability to write clinic letters and management plans for common orthopaedic scenarios.
Methods: Fifteen clinical scenarios were generated, and GPT-4, ChatGPT and GPT-3 were each given a single prompt to write clinic letters and management plans. Letters were assessed for readability using the Readable Tool. The accuracy of letters and management plans was assessed by six independent, blinded orthopaedic consultants.
Results: Readability was compared using the Flesch-Kincaid Grade Level (GPT-4: 9.11 (SD 0.98); ChatGPT: 8.77 (SD 0.918); GPT-3: 8.47 (SD 0.982)) and Flesch Reading Ease (GPT-4: 34.26 (SD 7.91); ChatGPT: 58.2 (SD 4.00); GPT-3: 59.3 (SD 6.98)). GPT-4, ChatGPT and GPT-3 produced accurate letters (mean 8.75/10 (SD 0.96), 8.7/10 (SD 0.60) and 7.3/10 (SD 1.41), respectively). GPT-4 and ChatGPT had significantly higher letter accuracy than GPT-3 (P = 0.024 and P = 0.019, respectively). Consultant-rated accuracy comparisons across versions 4.0, 3.5 and 3.0 showed that GPT-4 had the highest accuracy for management plans (9.08/10; 95% CI, 8.25–9.9). This represents a statistically significant progression in the ability of LLMs to provide accurate management plans, from GPT-3 (6.84; 95% CI, 5.41–8.27) to ChatGPT (7.63) to GPT-4 (P < 0.0001).
Conclusions: This study shows that next-generation LLMs are effective at generating clinic letters that are readable and accurate. Further, LLMs can produce generic management plans that are often accurate, demonstrating their continuing improvement. Given these findings, a specific LLM trained on accurate and secure healthcare data could be an excellent streamlining tool for clinicians in high-demand areas such as virtual fracture clinics.
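For readers unfamiliar with the two readability scores reported above, the following is a minimal sketch of the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The abstract does not describe how the Readable Tool computes its scores, so this is not the study's method. The function names and the vowel-group syllable counter are assumptions made for illustration, and a naive counter like this will generally not reproduce the values of a commercial tool.

```python
import re


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula (approximate US school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word: str) -> int:
    """Hypothetical heuristic: count runs of vowels, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (words, sentences, syllables) using simple regex splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


if __name__ == "__main__":
    sample = "The fracture is stable. Please wear the boot for six weeks."
    w, s, syl = text_counts(sample)
    print(round(flesch_reading_ease(w, s, syl), 2))
    print(round(flesch_kincaid_grade(w, s, syl), 2))
```

Both scores depend only on average sentence length and average syllables per word, which is why the two measures tend to move in opposite directions: longer sentences and longer words lower Reading Ease and raise Grade Level.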