This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Clinical AI Scribes in primary care: accuracy, error severity and implications for clinical practice
Citations: 2
Authors: 8
Year: 2025
Abstract
Objectives: To investigate the performance of commercially available Clinical Artificial Intelligence Scribes (CAISs), assessing their accuracy, potential clinical impact of errors, and documentation quality, given growing concerns around errors and safety.

Methods and analysis: Seven CAIS products were investigated, using eight standardised clinical consultation scenarios recorded as audio. CAIS-generated summaries were assessed against a human-validated transcript and evaluated for errors (omissions, factual inaccuracies and hallucinations). Error severity was rated by medical doctors, generating a novel severity-weighted Impact Score (linear and exponential variants), to quantify potential clinical impact. Further analysis using the Physician Documentation Quality Instrument (PDQI-10) (a validated clinical note quality score) reinforced the findings.

Results: Omissions dominated error counts (83.8%, p<<0.001), with CAISs varying widely in error frequency and severity, and a median of 1–6 omissions per consultation (depending on CAIS). Although less frequent, hallucinations and factual inaccuracies were more often clinically serious. No tested CAIS produced error-free summaries. The Impact Score highlighted clinical severity, notably amplifying the significance of less frequent but high-severity errors. PDQI-10 analysis indicated summaries were weakest in succinctness and organisation, but strong in consistency and clinical usefulness.

Conclusions: The CAISs demonstrate high levels of summarisation accuracy. However, there is great disparity between the currently available CAIS products and, while some perform well, none are perfect. Clinicians should therefore maintain vigilance, particularly checking omitted psychosocial details and medications, and scrutinising plausible-sounding insertions. Purchasers and regulators should be aware of the significant performance disparities identified, reinforcing the need for careful evaluation and selection of CAIS products.
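The abstract does not give the formula for the severity-weighted Impact Score, so the following is only an illustrative sketch of how a linear and an exponential weighting could differ. The 1 to 4 severity scale, the function names, the base of 2 and the example ratings are all assumptions made for illustration, not the authors' method or data.

```python
# Illustrative sketch only: the scale, weights, and data are hypothetical
# and are NOT taken from the paper.

def linear_impact(severities):
    """Sum the severity ratings, so each error counts in proportion to its rating."""
    return sum(severities)


def exponential_impact(severities, base=2.0):
    """Sum base**(rating - 1), so high-severity errors dominate the total."""
    return sum(base ** (s - 1) for s in severities)


# Two invented summaries: many minor omissions versus a few serious errors.
many_minor = [1, 1, 1, 1, 1, 1]
few_severe = [4, 3]

print(linear_impact(many_minor), linear_impact(few_severe))            # 6 7
print(exponential_impact(many_minor), exponential_impact(few_severe))  # 6.0 12.0
```

Under these assumed weights the two summaries score almost the same on the linear variant, while the exponential variant doubles the score of the summary with fewer but more severe errors. This matches the abstract's remark that the Impact Score amplifies less frequent, high-severity errors.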