Hongliu Cao
26 works · 171 citations
Relevant works
Most-cited publications in Health & MedTech
Multi-Agent LLM Judge: automatic personalized LLM judge design for evaluating natural language generation applications
2025 · 0 citations · ArXiv.org
Beyond Task Completion: Revealing Corrupt Success in LLM Agents through Procedure-Aware Evaluation
2026 · 0 citations · arXiv (Cornell University)