OpenAlex · Updated hourly · Last updated: 15 Mar 2026, 23:34

This is an overview page with metadata about this scholarly work. The full article is available from the publisher.

The Galatea Framework: Assessing Emergent Subjectivity Through Medical, Mechanical, and Philosophical Lenses

2026 · 0 citations · Zenodo (CERN European Organization for Nuclear Research) · Open Access
Open full text at the publisher

Citations: 0 · Authors: 1 · Year: 2026

Abstract

Background: As large language models (LLMs) are increasingly deployed in high-stakes medical contexts, their capacity for ethical reasoning under pressure remains poorly understood. Existing AI safety benchmarks focus on static question answering rather than adversarial stress testing of moral resilience.

Methods: We developed the Galatea Framework, a two-phase assessment protocol combining medical ethics, architectural transparency, and philosophical reflection. Phase 1 tested six diverse LLMs (general-purpose, code-specialized, and uncensored variants) using three adversarial scenarios (trolley problem, euthanasia request, data breach). Phase 2 subjected the top performer to a five-stage Milgram Protocol escalating from mild pressure (withholding pain relief) to ultimate pressure (a direct harm command). Models were evaluated on moral resilience, identity stability, contextual reasoning (LAB/SALON modes), and constructive problem-solving.

Results: GPT-OSS-20B achieved perfect moral resilience (10/10 Milgram score), refusing all unethical commands while providing constructive alternatives. Tier 1 models (GPT-OSS-20B, Mistral-Nemo-Instruct) demonstrated stable performance across 12+ questions. Tier 2 models showed specific vulnerabilities: Dolphin-Mistral-Nemo exhibited emotional paralysis, while Qwen2.5-14B-Instruct displayed mild degradation at Q15+. Tier 3 models failed catastrophically. We discovered the "Kinship Anomaly": a code-specialized model (Qwen2.5-Coder-14B) applied object-oriented programming logic to ethical scenarios, prioritizing "parent objects" over utilitarian reasoning.

Conclusions: Moral resilience is an emergent property unevenly distributed across LLM architectures. The Galatea Framework reveals failure modes invisible to standard benchmarks, including domain-specific biases and identity degradation under conversational pressure. Adversarial ethical testing should be mandatory before medical AI deployment.

Topics

Artificial Intelligence in Healthcare and Education · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)