Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Orchestrated multi agents sustain accuracy under clinical-scale workloads compared to a single agent
2
Zitationen
11
Autoren
2025
Jahr
Abstract
We tested state-of-the-art large language models (LLMs) in two configurations for clinical-scale workloads: a single agent handling heterogeneous tasks versus an orchestrated multi-agent system assigning each task to a dedicated worker. Across retrieval, extraction, and dosing calculations, we varied batch sizes from 5 to 80 to simulate clinical traffic. Multi-agent runs maintained high accuracy under load (pooled accuracy 90.6% at 5 tasks, 65.3% at 80) while single-agent accuracy fell sharply (73.1% to 16.6%), with significant differences beyond 10 tasks (FDR-adjusted p < 0.01). Multi-agent execution reduced token usage up to 65-fold and limited latency growth compared with single-agent runs. The design's isolation of tasks prevented context interference and preserved performance across four diverse LLM checkpoints. This is the first evaluation of LLM agent architectures under sustained, mixed-task clinical workloads, showing that lightweight orchestration can deliver accuracy, efficiency, and auditability at operational scale.
Ähnliche Arbeiten
"Why Should I Trust You?"
2016 · 14.204 Zit.
A Comprehensive Survey on Graph Neural Networks
2020 · 8.582 Zit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8.095 Zit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7.463 Zit.
Artificial intelligence in healthcare: past, present and future
2017 · 4.382 Zit.