This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Mitigative Strategies for Recovering From Large Language Model Trust Violations
Citations: 2
Authors: 3
Year: 2024
Abstract
In this study, we investigated strategies for repairing trust after errors by large language models (LLMs). The study examined the impact of confidence scores, system capability explanations, and user feedback on trust restoration post-error. Sixty-eight participants viewed an LLM's responses to 20 general trivia questions, with an error introduced on the third trial. Each participant was presented with one mitigation strategy. Participants rated their overall trust in the model and the reliability of each answer. Results showed an immediate drop in trust after the error; however, trust recovery did not differ across the three strategies. All conditions showed a logarithmic trend in trust recovery following the error. Differences in overall trust were predicted by the perceived reliability of the answers, suggesting that participants evaluated outputs critically and used those evaluations to inform their trust in the model. Qualitative data supported this finding: participants expressed lasting distrust despite the LLM's later accuracy. These results underscore the need to prioritize accuracy in LLM deployment, because early errors may irrevocably damage users' trust calibration and later adoption.
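To make the reported trend concrete: a logarithmic recovery pattern means trust ratings climb quickly in the trials immediately after the error and then flatten. The short Python sketch below uses invented ratings (no data from the paper) to show how such a trend could be fit with a least-squares regression on the log-transformed trial index.

import numpy as np

# Invented per-trial mean trust ratings after the error on trial 3;
# these values are illustrative only, not taken from the study.
trials = np.arange(3, 21)
rng = np.random.default_rng(0)
trust = 2.0 + 1.1 * np.log(trials - 2) + rng.normal(0.0, 0.1, trials.size)

# A logarithmic trend, trust ~ a + b * ln(trial - 2), becomes linear
# after transforming the trial index, so ordinary least squares
# recovers the intercept a and slope b.
x = np.log(trials - 2)
b, a = np.polyfit(x, trust, 1)
print(f"fitted recovery curve: trust ~ {a:.2f} + {b:.2f} * ln(trial - 2)")

A positive slope with decreasing marginal gains per trial is what distinguishes this logarithmic recovery from a linear rebound.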
Related Works
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
2017 · 20,305 citations
Generative Adversarial Nets
2014 · 19,841 citations
Visualizing and Understanding Convolutional Networks
2014 · 15,236 citations
"Why Should I Trust You?"
2016 · 14,204 citations
On a Method to Measure Supervised Multiclass Model’s Interpretability: Application to Degradation Diagnosis (Short Paper)
2024 · 13,103 citations