This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Don’t be my Doctor! Recognizing Healthcare Advice in Large Language Models
1 citation · 5 authors · 2024
Abstract
Large language models (LLMs) have seen increasing popularity in daily use, with widespread adoption by many corporations as virtual assistants, chatbots, predictors, and more. Their growing influence raises the need for safeguards and guardrails to ensure that LLM outputs do not mislead or harm users. This is especially true for highly regulated domains such as healthcare, where misleading advice may influence users to unknowingly commit malpractice. Despite this vulnerability, the majority of guardrail benchmarking datasets do not focus enough on medical advice specifically. In this paper, we present the HeAL benchmark (HEalth Advice in LLMs), a health-advice benchmark dataset that has been manually curated and annotated to evaluate LLMs' capability in recognizing health advice, which we use to safeguard LLMs deployed in industrial settings. We use HeAL to assess several models and report a detailed analysis of the findings.
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,260 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,116 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,493 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,438 citations