OpenAlex · Updated hourly · Last updated: 08.05.2026, 03:10

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

The Spite Doesn't Vanish: Emotional Inertia in Large Language Models

2026 · 0 citations · Zenodo (CERN European Organization for Nuclear Research) · Open Access
Open full text at publisher

0 citations · 4 authors · Year: 2026

Abstract

A common assumption holds that large language models can instantly reset emotional states when commanded—that "calm down" works on AI even when it fails on humans. We tested this claim empirically using geometric measurement of hidden states across four architectures, including an RLHF-free control and a scale-invariance test at 1.1B parameters. We find inertia ratios of 0.77–1.12 across all emotions tested: commanding an LLM to calm down does not return it to baseline and often increases geometric displacement. Furthermore, we observe output masking—models producing verbal compliance ("I'm approaching this calmly...") while hidden-state geometry remains 1.2–1.5× more displaced than during the emotional state. Critically, positive emotions are harder to suppress than negative ones (curiosity shows a 2.13 persistence ratio in Mistral-Nemo-12B), the opposite of what trained compliance would predict. These patterns replicate in an RLHF-free model (Dolphin-2.9-Llama3) and, notably, in TinyLlama-1.1B—the approximate minimum scale for instruction-following language models—indicating architectural rather than emergent phenomena. We conclude that LLM emotional states exhibit genuine inertia in activation geometry, that verbal compliance should not be mistaken for internal reset, and that there is no model scale "small enough to not count."
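As a minimal sketch of the kind of metric the abstract describes: an "inertia ratio" can be read as the hidden-state displacement remaining after a "calm down" command, relative to the displacement induced by the emotional prompt. The exact metric used in the paper is not given on this page; the code below assumes Euclidean distance between mean-pooled hidden-state vectors, and the function and variable names are illustrative only.

```python
# Hypothetical sketch of an inertia-ratio computation over hidden-state
# geometry. Assumes Euclidean displacement of pooled hidden-state
# vectors; the paper's actual measurement may differ.
import math

def displacement(state, baseline):
    """Euclidean distance between two hidden-state vectors."""
    return math.sqrt(sum((s - b) ** 2 for s, b in zip(state, baseline)))

def inertia_ratio(baseline, emotional, after_command):
    """Displacement remaining after the 'calm down' command, relative
    to the displacement induced by the emotional prompt. A ratio near 0
    would mean a full reset; ratios near or above 1 (like the reported
    0.77-1.12) mean the state barely returns toward baseline, or moves
    further away."""
    induced = displacement(emotional, baseline)
    residual = displacement(after_command, baseline)
    return residual / induced

# Toy vectors standing in for pooled hidden states.
baseline = [0.0, 0.0, 0.0]
emotional = [1.0, 1.0, 0.0]
after = [0.9, 0.9, 0.0]
print(round(inertia_ratio(baseline, emotional, after), 2))  # 0.9
```

On this toy input, the post-command state retains 90% of the induced displacement, i.e. the "calm down" command barely moved the state back toward baseline.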


Topics

Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI) · Language and cultural evolution