Dawn Song

559 works · 62,789 citations

Relevant works

Most-cited publications in Health & MedTech

Managing extreme AI risks amid rapid progress

2024 · 238 cit. · Science

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

2023 · 61 cit. · arXiv (Cornell University)

Managing extreme AI risks amid rapid progress

2023 · 27 cit. · arXiv (Cornell University)

Delving into adversarial attacks on deep policies

2017 · 24 cit. · International Conference on Learning Representations

LLM-PBE: Assessing Data Privacy in Large Language Models

2024 · 23 cit. · Proceedings of the VLDB Endowment

Advancing science- and evidence-based AI policy

2025 · 6 cit. · Science

The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation

2025 · 0 cit. · ArXiv.org