This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Clinical AI Scribes in primary care: accuracy, error severity and implications for clinical practice

2025 · 2 citations · BMJ Digital Health & AI · Open Access

Citations: 2 · Authors: 8 · Year: 2025

Abstract

Objectives To investigate the performance of commercially available Clinical Artificial Intelligence Scribes (CAISs), assessing their accuracy, potential clinical impact of errors, and documentation quality, given growing concerns around errors and safety.

Methods and analysis Seven CAIS products were investigated, using eight standardised clinical consultation scenarios recorded as audio. CAIS-generated summaries were assessed against a human-validated transcript and evaluated for errors (omissions, factual inaccuracies and hallucinations). Error severity was rated by medical doctors, generating a novel severity-weighted Impact Score (linear and exponential variants), to quantify potential clinical impact. Further analysis using the Physician Documentation Quality Instrument (PDQI-10) (a validated clinical note quality score) reinforced the findings.

Results Omissions dominated error counts (83.8%, p<<0.001), with CAISs varying widely in error frequency and severity, and a median of 1–6 omissions per consultation (depending on CAIS). Although less frequent, hallucinations and factual inaccuracies were more often clinically serious. No tested CAIS produced error-free summaries. The Impact Score highlighted clinical severity, notably amplifying the significance of less frequent but high-severity errors. PDQI-10 analysis indicated summaries were weakest in succinctness and organisation, but strong in consistency and clinical usefulness.

Conclusions The CAISs demonstrate high levels of summarisation accuracy. However, there is great disparity between the currently available CAIS products and, while some perform well, none are perfect. Clinicians should therefore maintain vigilance, particularly checking omitted psychosocial details and medications, and scrutinising plausible-sounding insertions. Purchasers and regulators should be aware of the significant performance disparities identified, reinforcing the need for careful evaluation and selection of CAIS products.
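
The abstract describes a severity-weighted Impact Score with linear and exponential variants, but this page does not give its formula. As a purely illustrative sketch, assume each error carries an integer severity rating (1 to 4 here, a hypothetical scale): a linear variant could sum the ratings, while an exponential variant could sum a base raised to the rating, so that a few severe errors outweigh many minor ones. The function names, the scale, and the base of 2 are assumptions for illustration, not the authors' definitions.

```python
from typing import Iterable


def linear_impact(severities: Iterable[int]) -> int:
    """Hypothetical linear variant: each error contributes its severity rating."""
    return sum(severities)


def exponential_impact(severities: Iterable[int], base: float = 2.0) -> float:
    """Hypothetical exponential variant: each error contributes base**(severity - 1),
    so the contribution doubles with every step up the (assumed) severity scale."""
    return sum(base ** (s - 1) for s in severities)


# Made-up example summaries, not data from the study:
many_minor = [1, 1, 1, 1, 1, 1]  # six low-severity errors (e.g. minor omissions)
few_severe = [4, 4]              # two high-severity errors (e.g. hallucinations)

print(linear_impact(many_minor), linear_impact(few_severe))
print(exponential_impact(many_minor), exponential_impact(few_severe))
```

Under these assumed weights the two summaries score close together on the linear variant, while the exponential variant separates them clearly, which mirrors the abstract's remark that the score amplifies less frequent but high-severity errors.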

Similar works