This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Letter to the Editor: Toward Retrieval-Grounded Evaluation for Conversational LLM-Based Risk Assessment (Preprint)
Citations: 0
Authors: 1
Year: 2026
Abstract
This letter provides a methodological commentary on a recently published study describing a conversational large language model (LLM)–based system for pediatric COVID-19 risk assessment. We discuss how evaluation based solely on LLM-only pipelines and aggregate discrimination metrics may overestimate reliability in conversational clinical applications when factual verifiability is not explicitly assessed. Drawing on recent empirical evidence from retrieval-augmented generation in medical tasks, we highlight the importance of evidence grounding for accuracy interpretation, safety assessment, and subgroup-level auditing. We suggest that retrieval-grounded sensitivity analyses may strengthen the evaluation of conversational AI systems intended for clinical or public-facing use.
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,493 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,377 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,835 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,555 citations