This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Benchmarking Large Language Models for Drug Combination Alerts: Achieving Expert-Level Reliability via Knowledge Grounding and Contextual Reasoning
Citations: 0
Authors: 7
Year: 2026
Abstract
Large language models (LLMs) have emerged as promising tools in the healthcare sector. However, their reliability in the critical task of identifying risky drug combinations remains unvalidated. Here, we systematically evaluated the potential of LLMs for drug combination alerting under the guidance of the CoMed framework through four aspects: (1) the baseline performance of native LLMs, (2) the contribution of external knowledge grounding via Retrieval-Augmented Generation (RAG), (3) the impact of expert-guided reasoning using context engineering, and (4) the utility of a multiagent architecture for comprehensive and interpretable risk analysis. Notably, by integrating RAG and the context engineering strategy, Qwen2.5-Max-CoT achieved outstanding performance (F1 = 0.971, AUC = 0.982), demonstrating an expert-level balance between precision and recall. Furthermore, a case study on aspirin-warfarin validated CoMed's ability to generate accurate assessments in a structured and traceable HTML report. This study demonstrates that enhanced LLMs can reliably and transparently support drug combination risk alerting and clinical decision-making.
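The knowledge-grounding step described in the abstract can be illustrated with a minimal sketch: retrieve relevant evidence snippets for a drug pair, then assemble a grounded prompt that asks the model to reason step by step. Everything here is an illustrative assumption (the toy knowledge base, the word-overlap retriever, and the prompt wording), not CoMed's actual implementation.

```python
# Minimal sketch of RAG-style grounding for a drug-pair risk query.
# The snippets and the overlap-scoring heuristic are illustrative
# assumptions, not the paper's actual CoMed pipeline.

KNOWLEDGE_BASE = [
    "Aspirin inhibits platelet aggregation and can potentiate bleeding.",
    "Warfarin is a vitamin K antagonist; bleeding risk rises with antiplatelet drugs.",
    "Metformin is a first-line oral antidiabetic with low interaction burden.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank snippets by naive word overlap with the query (toy retriever)."""
    q_words = set(query.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda s: len(q_words & set(s.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(drug_a: str, drug_b: str) -> str:
    """Assemble a grounded prompt an LLM would answer with a risk judgment."""
    evidence = retrieve(f"{drug_a} {drug_b} interaction bleeding")
    context = "\n".join(f"- {s}" for s in evidence)
    return (
        f"Evidence:\n{context}\n\n"
        f"Question: Is combining {drug_a} and {drug_b} risky? "
        "Answer step by step, citing the evidence."
    )

print(build_prompt("aspirin", "warfarin"))
```

In a real system the toy retriever would be replaced by a vector search over a curated drug-interaction corpus, and the model's answer would feed the structured report generation the abstract describes.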
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,245 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,100 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,466 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,429 citations