Dylan Hadfield-Menell
86 works · 1,470 citations
Relevant works
Most-cited publications in Health & MedTech
Pitfalls of Evidence-Based AI Policy
2025 · 2 citations · SuperIntelligence - Robotics - Safety & Alignment
Diagnostics for Deep Neural Networks with Automated Copy/Paste Attacks
2022 · 1 citation · arXiv (Cornell University)
Cognitive Dissonance: Why Do Language Model Outputs Disagree with Internal Representations of Truthfulness?
2023 · 0 citations · arXiv (Cornell University)
Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia
2025 · 0 citations · ArXiv.org
Pitfalls of Evidence-Based AI Policy
2025 · 0 citations · ArXiv.org