Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Real-World Performance of Large Language Models in Emergency Department Chest Pain Triage

2024·5 Zitationen·medRxivOpen Access

Volltext beim Verlag öffnen

Zitationen

Autoren

2024

Jahr

Abstract

Abstract Background Large Language Models (LLMs) are increasingly being explored for medical applications, particularly in emergency triage where rapid and accurate decision-making is crucial. This study evaluates the diagnostic performance of two prominent Chinese LLMs, “Tongyi Qianwen” and “Lingyi Zhihui,” alongside a newly developed model, MediGuide-14B, comparing their effectiveness with human medical experts in emergency chest pain triage. Methods Conducted at Peking University Third Hospital’s emergency centers from June 2021 to May 2023, this retrospective study involved 11,428 patients with chest pain symptoms. Data were extracted from electronic medical records, excluding diagnostic test results, and used to assess the models and human experts in a double-blind setup. The models’ performances were evaluated based on their accuracy, sensitivity, and specificity in diagnosing Acute Coronary Syndrome (ACS). Findings “Lingyi Zhihui” demonstrated a diagnostic accuracy of 76.40%, sensitivity of 90.99%, and specificity of 70.15%. “Tongyi Qianwen” showed an accuracy of 61.11%, sensitivity of 91.67%, and specificity of 47.95%. MediGuide-14B outperformed these models with an accuracy of 84.52%, showcasing high sensitivity and commendable specificity. Human experts achieved higher accuracy (86.37%) and specificity (89.26%) but lower sensitivity compared to the LLMs. The study also highlighted the potential of LLMs to provide rapid triage decisions, significantly faster than human experts, though with varying degrees of reliability and completeness in their recommendations. Interpretation The study confirms the potential of LLMs in enhancing emergency medical diagnostics, particularly in settings with limited resources. MediGuide-14B, with its tailored training for medical applications, demonstrates considerable promise for clinical integration. However, the variability in performance underscores the need for further fine-tuning and contextual adaptation to improve reliability and efficacy in medical applications. Future research should focus on optimizing LLMs for specific medical tasks and integrating them with conventional medical systems to leverage their full potential in real-world settings.

Autoren

Institutionen

Themen

Emergency and Acute Care StudiesAcute Myocardial Infarction ResearchArtificial Intelligence in Healthcare and Education

Volltext beim Verlag öffnen

Real-World Performance of Large Language Models in Emergency Department Chest Pain Triage

Abstract

Ähnliche Arbeiten

Autoren

Institutionen

Themen