This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Don't Just Translate, Agitate: Using Large Language Models as Devil's Advocates for AI Explanations
Citations: 0
Authors: 4
Year: 2025
Abstract
This position paper highlights a growing trend in XAI research where LLMs are used to translate explainability outputs into natural language, seemingly making model predictions more accessible to users. While this approach can improve interpretability, recent findings suggest that human-like explanations do not necessarily enhance user understanding and may instead lead to overreliance on AI systems. When LLMs passively summarize XAI outputs without surfacing model limitations, uncertainties, or inconsistencies, they risk reinforcing the illusion of interpretability rather than fostering meaningful transparency. We propose that—instead of merely translating XAI outputs—LLMs should serve as devil's advocates, actively interrogating AI explanations by presenting alternative interpretations, potential biases, training data limitations, and cases where the model's reasoning may break down. By challenging assumptions and surfacing uncertainty, LLMs can encourage users to critically (and uncomfortably) engage with AI systems rather than blindly trusting them. This paper examines existing approaches, discusses their limitations, and outlines a path to move beyond surface-level LLM explanations and towards deeper, more reliable AI understanding.
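The core proposal can be made concrete with a small sketch. The Python snippet below is hypothetical (the feature names, attribution values, and prompt wording are illustrative assumptions, not taken from the paper); it contrasts the passive "translate" pattern the abstract critiques with a devil's-advocate prompt built from the same SHAP-style attribution scores.

```python
# Illustrative sketch (not from the paper): contrast a passive
# "translate" prompt with a devil's-advocate prompt over the same
# XAI output. The attribution values stand in for e.g. SHAP scores;
# all names and prompt wording here are hypothetical.

attributions = {
    "income": 0.42,            # pushes the prediction toward "approve"
    "zip_code": 0.31,          # large weight on a proxy-prone feature
    "age": -0.18,              # pushes the prediction toward "deny"
    "employment_years": 0.05,
}
prediction = "approve"


def format_scores(attrs):
    """Render feature attributions as a bulleted list for the prompt."""
    return "\n".join(f"- {name}: {score:+.2f}" for name, score in attrs.items())


def translate_prompt(pred, attrs):
    """Passive summarization: the pattern the paper argues against."""
    return (
        f"The model predicted '{pred}'. Explain these feature "
        f"attributions to a lay user in plain language:\n{format_scores(attrs)}"
    )


def devils_advocate_prompt(pred, attrs):
    """Active interrogation: surface alternatives, biases, and failure modes."""
    return (
        f"The model predicted '{pred}' with these feature attributions:\n"
        f"{format_scores(attrs)}\n\n"
        "Act as a devil's advocate, not a translator:\n"
        "1. Give at least one alternative interpretation of these scores.\n"
        "2. Flag features that may act as proxies for protected attributes.\n"
        "3. Name plausible training-data gaps that could invalidate them.\n"
        "4. Describe an input where this reasoning would likely break down.\n"
        "5. State what remains uncertain instead of smoothing it over."
    )


if __name__ == "__main__":
    print(devils_advocate_prompt(prediction, attributions))
```

Either prompt can be sent to any chat-capable LLM; the point of the contrast is that the second asks the model to interrogate the explanation rather than merely paraphrase it.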
Related Works
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
2017 · 20,253 citations
Generative Adversarial Nets
2014 · 19,841 citations
Visualizing and Understanding Convolutional Networks
2014 · 15,230 citations
"Why Should I Trust You?"
2016 · 14.156 Zit.
On a Method to Measure Supervised Multiclass Model’s Interpretability: Application to Degradation Diagnosis (Short Paper)
2024 · 13,093 citations