This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Comparative Analysis of Large Language Models against the NHS 111 Online Triaging for Emergency Ophthalmology
Citations: 0
Authors: 2
Year: 2024
Abstract
Background: This study presents a comprehensive evaluation of the performance of various large language models (LLMs) in generating responses for ophthalmology emergencies and compares their accuracy with the established NHS 111 online triage system.

Methods: We included 21 ophthalmology-related emergency scenario questions from the NHS 111 triaging algorithm. These questions were based on four different ophthalmology emergency themes as laid out in the NHS 111 algorithm. The responses generated by NHS 111 online were compared with the responses of the different LLM chatbots. We included a range of models: ChatGPT-3.5, Google Bard, Bing Chat, and ChatGPT-4.0. The accuracy of each LLM chatbot response was compared against the NHS 111 triage using a two-prompt strategy. Answers were graded separately by two different authors as follows: −2 as “Very poor”, −1 as “Poor”, 0 as “No response”, 1 as “Good”, 2 as “Very good”, and 3 as “Excellent”.

Results: An overall score of ≥ 1 (“Good” or better) was achieved by 93% of responses across all LLMs. A score of ≥ 1 means that at least part of the answer contained correct information and partially matched the NHS 111 response, and that the answer contained no incorrect information or advice potentially harmful to the patient’s health.

Conclusions: The high accuracy and safety observed in LLM responses support their potential as effective tools for providing timely information and guidance to patients. While further research is warranted to validate these findings in clinical practice, LLMs hold promise in enhancing patient care and healthcare accessibility in the digital age.
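To make the grading arithmetic concrete, the six-point scale from the Methods and the "score of ≥ 1" criterion from the Results can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `share_good_or_better` and the example scores are hypothetical, and the abstract does not say how the two graders' scores were combined, so the sketch simply takes one list of final scores.

```python
from typing import Iterable

# Grade labels exactly as listed in the abstract's Methods.
GRADE_LABELS = {
    -2: "Very poor",
    -1: "Poor",
    0: "No response",
    1: "Good",
    2: "Very good",
    3: "Excellent",
}


def share_good_or_better(scores: Iterable[int]) -> float:
    """Return the fraction of scores that are >= 1 ("Good" or better)."""
    scores = list(scores)
    if not scores:
        raise ValueError("no scores given")
    for s in scores:
        if s not in GRADE_LABELS:
            raise ValueError(f"unknown grade: {s}")
    return sum(1 for s in scores if s >= 1) / len(scores)


# Hypothetical scores, invented for illustration only (not study data).
example_scores = [3, 2, 1, 0, -1, 2, 3, 1]
print(f"{share_good_or_better(example_scores):.0%} graded Good or better")
```

With the invented scores above, six of eight responses reach the threshold. The 93% figure reported in the abstract would come from applying the same kind of count to the study's own graded responses, which are not reproduced on this page.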
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,239 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,095 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,463 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,428 citations