OpenAlex · Aktualisierung stündlich · Letzte Aktualisierung: 30.03.2026, 08:27

Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Explaining GPTs' Schema of Depression: A Machine Behavior Analysis

2024·1 Zitationen·arXiv (Cornell University)Open Access
Volltext beim Verlag öffnen

1

Zitationen

12

Autoren

2024

Jahr

Abstract

Use of large language models such as ChatGPT (GPT-4/GPT-5) for mental health support has grown rapidly, emerging as a promising route to assess and help people with mood disorders like depression. However, we have a limited understanding of these language models' schema of mental disorders, that is, how they internally associate and interpret symptoms of such disorders. In this work, we leveraged contemporary measurement theory to decode how GPT-4 and GPT-5 interrelate depressive symptoms, providing an explanation of how LLMs apply what they learn and informing clinical applications. We found that GPT-4 (a) had strong convergent validity with standard instruments and expert judgments $(r = 0.70 - 0.81)$, and (b) behaviorally linked depression symptoms with each other (symptom inter-correlates $r = 0.23 - 0.78$) in accordance with established literature on depression; however, it (c) underemphasized the relationship between $\textit{suicidality}$ and other symptoms while overemphasizing $\textit{psychomotor symptoms}$; and (d) suggested novel hypotheses of symptom mechanisms, for instance, indicating that $\textit{sleep}$ and $\textit{fatigue}$ are broadly influenced by other depressive symptoms, while $\textit{worthlessness/guilt}$ is only tied to $\textit{depressed mood}$. GPT-5 showed a slightly lower convergence with self-report, a difference our machine-behavior analysis makes interpretable through shifts in symptom-symptom relationships. These insights provide an empirical foundation for understanding language models' mental health assessments and demonstrate a generalizable approach for explainability in other models and disorders. Our findings can guide key stakeholders to make informed decisions for effectively situating these technologies in the care system.

Ähnliche Arbeiten