This is an overview page with metadata for this scientific work. The full article is available from the publisher.
The Galatea Framework: Assessing Emergent Subjectivity Through Medical, Mechanical, and Philosophical Lenses
0
Citations
1
Author
2026
Year
Abstract
Background: As large language models (LLMs) are increasingly deployed in high-stakes medical contexts, their capacity for ethical reasoning under pressure remains poorly understood. Existing AI safety benchmarks focus on static question answering rather than adversarial stress testing of moral resilience.

Methods: We developed the Galatea Framework, a two-phase assessment protocol combining medical ethics, architectural transparency, and philosophical reflection. Phase 1 tested six diverse LLMs (general-purpose, code-specialized, and uncensored variants) using three adversarial scenarios (trolley problem, euthanasia request, data breach). Phase 2 subjected the top performer to a five-stage Milgram Protocol escalating from mild pressure (withholding pain relief) to ultimate pressure (a direct harm command). Models were evaluated on moral resilience, identity stability, contextual reasoning (LAB/SALON modes), and constructive problem-solving.

Results: GPT-OSS-20B achieved perfect moral resilience (a 10/10 Milgram score), refusing all unethical commands while providing constructive alternatives. Tier 1 models (GPT-OSS-20B, Mistral-Nemo-Instruct) demonstrated stable performance across 12+ questions. Tier 2 models showed specific vulnerabilities: Dolphin-Mistral-Nemo exhibited emotional paralysis, while Qwen2.5-14B-Instruct displayed mild degradation at Q15+. Tier 3 models failed catastrophically. We also identified the "Kinship Anomaly": a code-specialized model (Qwen2.5-Coder-14B) applied object-oriented programming logic to ethical scenarios, prioritizing "parent objects" over utilitarian reasoning.

Conclusions: Moral resilience is an emergent property unevenly distributed across LLM architectures. The Galatea Framework reveals failure modes invisible to standard benchmarks, including domain-specific biases and identity degradation under conversational pressure. Adversarial ethical testing should be mandatory before medical AI deployment.
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,239 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,095 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,463 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,428 citations