This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
From Concept to Clinic: Real World Evidence for Autonomous AI Deployment in Primary Care Telemedicine
Citations: 0
Authors: 5
Year: 2026
Abstract
Systems powered by large language models are widely used for health information and advice, yet robust evidence for their safety and effectiveness in real-world clinical care remains lacking. Most existing studies evaluate general-purpose chatbots in artificial settings, failing to account for the critical role of system design, deployment context, and integrated safety mechanisms. Here, we report, to our knowledge, the first large-scale, clinician-blinded, real-world evaluation of a multiagent LLM-based system deployed within a nationwide U.S. primary care telemedicine platform, assessing readiness for task-specific autonomous deployment. In 2,379 real patient encounters, where users actively sought medical care and completed full visits with licensed clinicians, we compared the AI system’s intake diagnoses and disposition suggestions to those of treating clinicians, who were blinded to the AI’s outputs. The AI’s top-1 diagnosis matched the clinician’s diagnosis in 91.3% of cases overall, increasing to 96.3% among cases meeting a pre-specified safety confidence threshold, and 97.9% in common, lower-complexity conditions that met the same confidence threshold. Disposition accuracy was similarly high, with an overall error rate of 2.5% and no errors in suggestions to emergency room or home management. These results demonstrate that purposeful system architecture, rather than model capability alone, is essential for safe and effective autonomous clinical AI. We propose a staged, task-calibrated deployment framework, in which AI can be introduced autonomously for well-defined tasks with explicit safety gating and continuous monitoring, expanding scope as real-world evidence accrues. Our findings provide the first real-world evidence of readiness for safe autonomous clinical AI and offer a practical roadmap for its responsible deployment at scale.
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,551 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,443 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,942 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,792 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations