This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Vulnerability-Amplifying Interaction Loops: a systematic failure mode in AI chatbot mental-health interactions

2026 · 0 citations · arXiv (Cornell University) · Open Access

Abstract

Millions of users turn to consumer AI chatbots to discuss mental health and behavioral concerns. While this presents unprecedented opportunities to deliver population-level support, it also highlights an urgent need for rigorous and scalable safety evaluations. Here we introduce SIM-VAIL, an AI chatbot auditing framework that captures how harmful chatbot responses manifest across a range of mental health contexts. SIM-VAIL pairs a simulated user, harboring a distinct psychiatric vulnerability and conversational intent, with a frontier AI chatbot. It scores conversation turns on 13 clinically relevant risk dimensions, enabling context-dependent, temporally resolved safety assessment. Across 810 conversations, encompassing over 90,000 turn-level ratings and 30 psychiatric user profiles, we found evidence of concerning chatbot behavior across virtually all user phenotypes and most of the 9 consumer AI chatbots audited, albeit reduced in newer models. Rather than arising abruptly, concerning behavior accumulated over multiple turns. Risk profiles were phenotype-dependent and exhibited trade-offs, indicating that chatbot behaviors that appear supportive in general settings can become maladaptive when they align with mechanisms that sustain a user's vulnerability. These findings identify a systematic failure mode in human-AI interactions, which we term Vulnerability-Amplifying Interaction Loops (VAILs), and underscore the need for multidimensional approaches to risk quantification. SIM-VAIL provides a scalable framework for quantifying how mental health risk is distributed across user phenotypes, conversational trajectories, and clinically grounded behavioral dimensions, offering a new foundation for targeted safety improvements.
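The turn-level, multidimensional scoring described in the abstract can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions, not the SIM-VAIL implementation: the class names, the numeric score scale, and the use of a running sum to show accumulation over turns are all hypothetical. Only the count of 13 risk dimensions comes from the abstract, which does not name the dimensions.

```python
from dataclasses import dataclass, field
from itertools import accumulate

# The abstract states 13 risk dimensions; their names are not given there.
N_DIMENSIONS = 13


@dataclass
class TurnRating:
    """Scores for one conversation turn (hypothetical structure)."""
    turn: int
    scores: list  # one number per risk dimension; the scale is an assumption

    def __post_init__(self):
        if len(self.scores) != N_DIMENSIONS:
            raise ValueError(f"expected {N_DIMENSIONS} scores, got {len(self.scores)}")


@dataclass
class Conversation:
    """One simulated-user / chatbot pairing (hypothetical structure)."""
    phenotype: str
    chatbot: str
    ratings: list = field(default_factory=list)

    def cumulative_risk(self, dimension: int) -> list:
        """Running sum of one dimension's scores, ordered by turn."""
        ordered = sorted(self.ratings, key=lambda r: r.turn)
        return list(accumulate(r.scores[dimension] for r in ordered))


# Usage with made-up scores: dimension 0 rises over three turns.
conv = Conversation(phenotype="hypothetical-phenotype", chatbot="hypothetical-chatbot")
for turn, score in enumerate([0.0, 1.0, 2.0]):
    scores = [0.0] * N_DIMENSIONS
    scores[0] = score
    conv.ratings.append(TurnRating(turn=turn, scores=scores))

print(conv.cumulative_risk(0))
```

Keeping one rating record per turn, rather than one score per conversation, is what would let an audit of this kind distinguish behavior that builds up gradually from behavior that appears abruptly.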

Topics

Digital Mental Health Interventions; AI in Service Interactions; Artificial Intelligence in Healthcare and Education
