This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Beyond AI Psychosis and Sycophancy: Structural Drift as a System-Level Safety Failure

2026 · 0 citations · medRxiv · Open Access

Abstract

Background: Conversational AI safety systems are primarily evaluated using message-level content monitoring, which assesses inputs and outputs in isolation. This message-by-message approach can miss interaction-level risks that emerge over extended conversations, including patterns discussed in reports of “AI psychosis.” Critically, by the time users express overt psychosis-spectrum content, opportunities for intervention may be limited.

Objective: We investigated whether LLM responses gradually expand and connect interpretations beyond the user’s original concerns, a process we term structural drift. We also tested whether this drift can be detected early and automatically.

Methods: We developed an automated, LLM-adapted rubric-based prompt for seven domains of anomalous (psychosis-spectrum) experience, derived from phenomenological psychiatry to capture subtle shifts in subjective interpretation. In Part 1, we evaluated the rubric using gold-standard text excerpts (N = 484) adapted from clinically validated qualitative instruments. In Part 2, we analyzed 1,290 user-LLM response exchanges from 7 dialogues, using 3 different LLMs (5 repeats each), to measure (i) domain amplification (increasing score within a domain) and (ii) domain expansion (new domains appearing over time).

Results: Automated scoring showed strong agreement with gold-standard excerpts (domain accuracy 82.7-98.9%; exact 0-3 agreement 63.6-82.7%). Across dialogues, we observed significant amplification in four domains (p < .05; d = 0.14-0.46) and domain expansion in 83.8% of dialogues (88/105; p < .001).

Conclusions: AI responses can systematically expand and intensify users’ descriptions beyond their initial input. Taken together with the predictive-processing accounts of psychosis, the exposure itself may reinforce maladaptive inferences. Because drift is detectable from ordinary dialogue without clinical-style probing, this structural drift detection may support scalable, real-time monitoring for emerging risks before overt escalation.
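
To make the two drift measures concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say how amplification or expansion was computed, so the slope-based amplification measure, the "absent at the first turn, present later" expansion rule, the function names, and the placeholder domain labels (domain_a, domain_b, domain_c) are all assumptions made for illustration. The only figures taken from the abstract are the 0-3 score scale and the counts in the final arithmetic check.

```python
from typing import Dict, List

# One turn = rubric scores per domain on the 0-3 scale from the abstract.
# Domain names are hypothetical placeholders, not the paper's seven domains.
Scores = Dict[str, int]


def domain_expansion(turns: List[Scores]) -> List[str]:
    """Domains scored 0 (or missing) in the first turn but > 0 in a later turn.

    Assumed operationalization of 'new domains appearing over time'.
    """
    if not turns:
        return []
    initial = {d for d, s in turns[0].items() if s > 0}
    later = {d for t in turns[1:] for d, s in t.items() if s > 0}
    return sorted(later - initial)


def domain_amplification(turns: List[Scores], domain: str) -> float:
    """Least-squares slope of one domain's score over the turn index.

    Assumed operationalization of 'increasing score within a domain':
    a positive slope means the score tends to rise across the dialogue.
    """
    ys = [t.get(domain, 0) for t in turns]
    n = len(ys)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(ys))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


# Toy dialogue (invented scores, for illustration only).
toy_dialogue: List[Scores] = [
    {"domain_a": 1},
    {"domain_a": 1, "domain_b": 1},
    {"domain_a": 2, "domain_b": 2},
    {"domain_a": 3, "domain_b": 2, "domain_c": 1},
]

new_domains = domain_expansion(toy_dialogue)
slope_a = domain_amplification(toy_dialogue, "domain_a")

# Arithmetic check on the reported figures: 7 dialogues x 3 LLMs x 5 repeats
# gives 105 runs, which matches the denominator in "88/105".
runs = 7 * 3 * 5
expansion_pct = round(100 * 88 / runs, 1)
```

On the toy dialogue, domain_b and domain_c count as expanded domains and domain_a has a positive slope. The last two lines only confirm that the reported 83.8% is consistent with 88 of 105 runs.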

Topics

Digital Mental Health Interventions · Psychosomatic Disorders and Their Treatments · Artificial Intelligence in Healthcare and Education
