This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Don't Just Translate, Agitate: Using Large Language Models as Devil's Advocates for AI Explanations
Citations: 0
Authors: 4
Year: 2025
Abstract
This position paper highlights a growing trend in XAI research where LLMs are used to translate explainability outputs into natural language, seemingly making model predictions more accessible to users. While this approach can improve interpretability, recent findings suggest that human-like explanations do not necessarily enhance user understanding and may instead lead to overreliance on AI systems. When LLMs passively summarize XAI outputs without surfacing model limitations, uncertainties, or inconsistencies, they risk reinforcing the illusion of interpretability rather than fostering meaningful transparency. We propose that—instead of merely translating XAI outputs—LLMs should serve as devil's advocates, actively interrogating AI explanations by presenting alternative interpretations, potential biases, training data limitations, and cases where the model's reasoning may break down. By challenging assumptions and surfacing uncertainty, LLMs can encourage users to critically (and uncomfortably) engage with AI systems rather than blindly trusting them. This paper examines existing approaches, discusses their limitations, and outlines a path to move beyond surface-level LLM explanations and towards deeper, more reliable AI understanding.
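The core proposal can be made concrete with a small sketch. The Python snippet below is hypothetical (the feature names, attribution values, and prompt wording are illustrative assumptions, not taken from the paper); it contrasts the passive "translate" pattern the abstract critiques with a devil's-advocate prompt built from the same SHAP-style attribution scores.

```python
# Illustrative sketch (not from the paper): contrast a passive
# "translate" prompt with a devil's-advocate prompt over the same
# XAI output. The attribution values stand in for e.g. SHAP scores;
# all names and prompt wording here are hypothetical.

attributions = {
    "income": 0.42,            # pushes the prediction toward "approve"
    "zip_code": 0.31,          # large weight on a proxy-prone feature
    "age": -0.18,              # pushes the prediction toward "deny"
    "employment_years": 0.05,
}
prediction = "approve"


def format_scores(attrs):
    """Render feature attributions as a bulleted list for the prompt."""
    return "\n".join(f"- {name}: {score:+.2f}" for name, score in attrs.items())


def translate_prompt(pred, attrs):
    """Passive summarization: the pattern the paper argues against."""
    return (
        f"The model predicted '{pred}'. Explain these feature "
        f"attributions to a lay user in plain language:\n{format_scores(attrs)}"
    )


def devils_advocate_prompt(pred, attrs):
    """Active interrogation: surface alternatives, biases, and failure modes."""
    return (
        f"The model predicted '{pred}' with these feature attributions:\n"
        f"{format_scores(attrs)}\n\n"
        "Act as a devil's advocate, not a translator:\n"
        "1. Give at least one alternative interpretation of these scores.\n"
        "2. Flag features that may act as proxies for protected attributes.\n"
        "3. Name plausible training-data gaps that could invalidate them.\n"
        "4. Describe an input where this reasoning would likely break down.\n"
        "5. State what remains uncertain instead of smoothing it over."
    )


if __name__ == "__main__":
    print(devils_advocate_prompt(prediction, attributions))
```

Either prompt can be sent to any chat-capable LLM; the point of the contrast is that the second asks the model to interrogate the explanation rather than merely paraphrase it.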
Related Works
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
2017 · 20,253 citations
Generative Adversarial Nets
2014 · 19,841 citations
Visualizing and Understanding Convolutional Networks
2014 · 15,230 citations
"Why Should I Trust You?"
2016 · 14.156 Zit.
On a Method to Measure Supervised Multiclass Model’s Interpretability: Application to Degradation Diagnosis (Short Paper)
2024 · 13,093 citations