This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Towards Automated Error Discovery: A Study in Conversational AI

2025 · 0 citations · Open Access

Abstract

Although LLM-based conversational agents demonstrate strong fluency and coherence, they still produce undesirable behaviors (errors) that are challenging to prevent from reaching users during deployment. Recent research leverages large language models (LLMs) to detect errors and guide response-generation models toward improvement. However, current LLMs struggle to identify errors not explicitly specified in their instructions, such as those arising from updates to the response-generation model or shifts in user behavior. In this work, we introduce Automated Error Discovery, a framework for detecting and defining errors in conversational AI, and propose SEEED (Soft Clustering Extended Encoder-Based Error Detection), an encoder-based approach to its implementation. We enhance the Soft Nearest Neighbor Loss by amplifying distance weighting for negative samples and introduce Label-Based Sample Ranking to select highly contrastive examples for better representation learning. SEEED outperforms adapted baselines, including GPT-4o and Phi-4, across multiple error-annotated dialogue datasets, improving the accuracy for detecting unknown errors by up to 8 points and demonstrating strong generalization to unknown intent detection.

We provide our code on GitHub: https://github.com/UKPLab/emnlp2025-automatic-error-discovery.

[...] 2023; Mi et al., 2020; Roller et al., 2020), these changes may lead to the emergence of new error types that the LLM might not recognize.

In this work, we address the challenge of error detection in conversational AI. We introduce Automated Error Discovery, a framework for detecting and defining errors in dialogue, and propose SEEED (Soft Clustering Extended Encoder-Based Error Detection) as an approach to its implementation. Our contributions are as follows:

We introduce Automated Error Discovery, a framework for (1) detecting both known and unknown error types, and (2) generating definitions for newly discovered ones.

We propose SEEED, a novel approach that combines an open-source LLM with lightweight encoders for error detection. In contrast to prior work, SEEED employs soft clustering in the classification step, enabling more contextually coherent groupings.

We introduce Label-Based Sample Ranking, a novel sampling strategy for contrastive learning that selects highly contrastive examples based on the error they represent to improve representation learning.

We enhance the Soft Nearest Neighbor Loss (Frosst et al., 2019) by introducing a margin parameter to amplify the effect of distance weighting for negative samples.

[Figure: example dialogue about a summer vacation in Rome, with components labeled "Summary" and "Encoder".]
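
For orientation, the following is a minimal sketch of the standard Soft Nearest Neighbor Loss (Frosst et al., 2019) that the abstract says SEEED builds on. It is not the SEEED loss: the margin parameter for negative samples is not specified in this text, so it is deliberately left out. The function name, the plain-list inputs, the squared Euclidean distance, and the choice to skip anchors that have no same-label neighbor are all assumptions made for illustration.

```python
import math


def soft_nearest_neighbor_loss(embeddings, labels, temperature=1.0):
    """Baseline Soft Nearest Neighbor Loss over one batch (illustrative sketch).

    For each anchor i, compares the similarity mass of same-label neighbors
    with the similarity mass of all other points in the batch. Lower values
    mean that same-label points sit closer together than other points.
    """
    n = len(embeddings)
    total = 0.0
    counted = 0
    for i in range(n):
        same_mass = 0.0  # sum over j != i with labels[j] == labels[i]
        all_mass = 0.0   # sum over all k != i
        for j in range(n):
            if j == i:
                continue
            # Squared Euclidean distance between the two embeddings.
            dist = sum((a - b) ** 2 for a, b in zip(embeddings[i], embeddings[j]))
            weight = math.exp(-dist / temperature)
            all_mass += weight
            if labels[j] == labels[i]:
                same_mass += weight
        # Assumption: anchors without a usable same-label neighbor are skipped
        # instead of contributing log(0).
        if same_mass > 0.0:
            total += -math.log(same_mass / all_mass)
            counted += 1
    return total / counted if counted else 0.0


# Toy batch: two tight groups of 2-D points.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
clustered = soft_nearest_neighbor_loss(points, [0, 0, 1, 1])
mixed = soft_nearest_neighbor_loss(points, [0, 1, 0, 1])
```

On this toy batch, labels that follow the spatial grouping give a loss near zero, while labels that cut across the groups give a much larger value. That gap is the signal a contrastive encoder is trained to reduce.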

Institutions

Hess (United States) (US)

Topics

Topic Modeling; Artificial Intelligence in Healthcare and Education; AI in Service Interactions
