This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Developing and Testing an Engineering Framework for Curiosity-Driven and Humble AI in Clinical Decision Support
Citations: 0
Authors: 19
Year: 2026
Abstract
Background: We present BODHI (Balanced, Open-minded, Diagnostic, Humble, and Inquisitive), an engineering framework for curiosity-driven and humble clinical decision support AI. Despite growing capabilities, large language models (LLMs) often express inappropriate confidence, conflating statistical pattern recognition with genuine medical understanding. BODHI addresses this through a dual-reflective architecture that: (1) decomposes epistemic uncertainty into task-specific dimensions, and (2) constrains model responses using virtue-based stance rules derived from a Virtue Activation Matrix.

Methods: We validate the framework through controlled evaluation on 200 clinical vignettes from HealthBench Hard, assessing GPT-4o-mini and GPT-4.1-mini across 5 random seeds (1,800 total observations). Statistical analysis included bootstrap resampling, paired t-tests, and effect size computation (Supplementary Materials S3).

Findings: BODHI significantly improved overall clinical response quality (GPT-4.1-mini: +17.3pp, p < 0.0001, Cohen's d = 0.50; GPT-4o-mini: +7.4pp, p < 0.0001, Cohen's d = 0.22) while achieving very large effect sizes on curiosity (context-seeking rate: Cohen's d = 16.38 and 19.54) and humility (hedging: d = 5.80 for GPT-4.1-mini) metrics. Crucially, 97.3% of GPT-4.1-mini responses and 73.5% of GPT-4o-mini responses included appropriate clarifying questions, compared to 7.8% and 0.0% at baseline, demonstrating the framework's effectiveness in eliciting information-gathering behavior.

Interpretation: These findings suggest LLMs can be reliably constrained to operate within epistemic boundaries when provided with structured uncertainty decomposition and virtue-aligned response rules, offering a pathway toward safer clinical AI deployment.
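The paired-comparison analysis described in the Methods can be sketched in a few lines. The snippet below is illustrative only: the scores are synthetic stand-ins for per-vignette quality ratings (not the paper's data), and Cohen's d is computed here as the mean of the paired differences over their standard deviation (the paired-samples convention), which may differ from the paper's exact formula.

```python
import math
import random
import statistics

def paired_analysis(baseline, treated, n_boot=2000, seed=0):
    """Mean difference, paired t statistic, Cohen's d, and a bootstrap 95% CI.

    Illustrative sketch: assumes one quality score per vignette per condition.
    """
    diffs = [t - b for b, t in zip(baseline, treated)]
    n = len(diffs)
    mean_d = statistics.fmean(diffs)
    sd_d = statistics.stdev(diffs)
    t_stat = mean_d / (sd_d / math.sqrt(n))  # paired t statistic
    cohens_d = mean_d / sd_d                 # effect size on paired differences
    # Percentile bootstrap over resampled difference scores
    rng = random.Random(seed)
    boot_means = sorted(
        statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_boot)
    )
    ci = (boot_means[int(0.025 * n_boot)], boot_means[int(0.975 * n_boot) - 1])
    return {"mean_diff": mean_d, "t": t_stat, "d": cohens_d, "ci95": ci}

# Synthetic example: 200 vignette scores with a ~+17pp improvement.
rng = random.Random(42)
baseline = [rng.gauss(0.50, 0.10) for _ in range(200)]
treated = [b + rng.gauss(0.17, 0.08) for b in baseline]
result = paired_analysis(baseline, treated)
print(result)
```

With these synthetic scores the mean difference lands near +0.17 with a tight bootstrap interval; on real data one would also report a p-value from the t statistic, as the paper does.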
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,200 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,051 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,416 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,410 citations
Authors
Institutions
- Centre National de la Recherche Scientifique (FR)
- Inserm (FR)
- The University of Melbourne (AU)
- Sorbonne Université (FR)
- Assistance Publique – Hôpitaux de Paris (FR)
- Centre for Eye Research Australia (AU)
- Institut du Cerveau (FR)
- Massachusetts Institute of Technology (US)
- Harvard University (US)
- Hadassah Medical Center (IL)
- Fundación Valle del Lili (CO)
- Icesi University (CO)
- University College London (GB)
- Cambridge University Hospitals NHS Foundation Trust (GB)
- The Centre for Health (New Zealand) (NZ)
- Mbarara University of Science and Technology (UG)
- King's College London (GB)
- University of Bergen (NO)
- Titanium Metals Corporation (United Kingdom) (GB)
- ETH Zurich (CH)
- ZHAW Zurich University of Applied Sciences (CH)
- University of Zurich (CH)