This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Large Language Models in Clinical Advice: Direct Generation and Retrieval Augmented Generation vs Expert Advice
Citations: 1
Authors: 2
Year: 2025
Abstract
The NHS faces mounting pressures, resulting in workforce attrition and growing care backlogs. Pharmacy services, critical for ensuring medication safety and effectiveness, are often overlooked in digital innovation efforts. This pilot study investigates the potential of Large Language Models (LLMs) to alleviate pharmacy pressures by answering clinical pharmaceutical queries. Two retrieval techniques were evaluated: Vanilla Retrieval Augmented Generation (RAG) and Graph RAG, supported by an external knowledge source designed specifically for this study. ChatGPT 4o without retrieval served as a control. Quantitative and qualitative evaluations were conducted, including expert human assessments for response accuracy, relevance, and safety. Results demonstrated that LLMs can generate high-quality responses. In expert evaluations, Vanilla RAG outperformed other models and even human reference answers for accuracy and risk. Graph RAG revealed challenges related to retrieval accuracy. Despite the promise of LLMs, hallucinations and the ambiguity around LLM evaluations in healthcare remain key barriers to clinical deployment. This pilot study underscores the importance of robust evaluation frameworks to ensure the safe integration of LLMs into clinical workflows. However, regulatory bodies have yet to catch up with the rapid pace of LLM development. Guidelines are urgently needed to address the issues of transparency, explainability, data protection, and validation, to facilitate the safe and effective deployment of LLMs in clinical practice.
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,469 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,358 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,803 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,542 citations