OpenAlex · Updated hourly · Last updated: 22 April 2026, 21:18

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Human–large language model collaboration in clinical medicine: a systematic review and meta-analysis

2026 · 3 citations · npj Digital Medicine · Open Access

Citations: 3 · Authors: 10 · Year: 2026

Abstract

Human-AI collaboration (H + AI) using large language models (LLMs) offers a promising approach to enhance clinical reasoning, documentation, and interpretation tasks. Following PRISMA 2020 (PROSPERO registration: CRD420251068272), we systematically compared H + AI with human-only (H) workflows, searching four databases through June 28, 2025. Ten peer-reviewed studies met eligibility criteria, with three preprints informing sensitivity analyses only. Diagnostic/interpretation accuracy (k = 2) showed a positive trend for H + AI (Risk Ratio [RR] 1.59), but the estimate was statistically imprecise and non-significant (95% CI 0.08 to 32.74), with the 95% prediction interval (PI) crossing the null. Composite diagnostic/management scores (k = 2) showed a statistically significant improvement (Mean Difference [MD] +4.88 percentage points, 95% CI +0.65 to +9.12), yet the PI (-31.65 to +41.42) indicates high real-world uncertainty. Time efficiency (k = 3) showed no overall difference (MD +0.4 min, 95% CI -4.18 to +4.97; I² = 70.1%). Documentation quality improved, but factual error rates remained high (~26-36%), undermining quality gains. In three-arm settings, H + AI did not universally outperform AI-only. Evidence remains preliminary, highly uncertain, and context-dependent. We recommend preregistered, pragmatic, multicenter trials embedded in real workflows, with harmonized core outcomes that prioritize safety/error metrics and interfaces that surface uncertainty and support verification.
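
As a reading aid, the intervals quoted in the abstract can be sanity-checked with simple arithmetic. The sketch below is not the authors' analysis. It assumes only that the mean-difference intervals are symmetric around their point estimate, and that the risk-ratio interval is symmetric on the log scale, so its centre is the geometric mean of the bounds. The function names are hypothetical, and the input numbers are copied from the abstract.

```python
import math


def ratio_center(lo, hi):
    """Centre of an interval that is symmetric on the log scale
    (assumed here for the risk ratio): the geometric mean of the bounds."""
    return math.sqrt(lo * hi)


def difference_center(lo, hi):
    """Centre of an interval that is symmetric on the raw scale
    (assumed here for mean differences): the midpoint of the bounds."""
    return (lo + hi) / 2


# Bounds as reported in the abstract.
rr_center = ratio_center(0.08, 32.74)            # RR 95% CI, reported RR 1.59
md_ci_center = difference_center(0.65, 9.12)     # MD 95% CI, reported MD +4.88
md_pi_center = difference_center(-31.65, 41.42)  # MD 95% PI, same point estimate
time_center = difference_center(-4.18, 4.97)     # time MD 95% CI, reported +0.4 min

for label, value in [
    ("RR centre (geometric mean)", rr_center),
    ("MD centre from CI", md_ci_center),
    ("MD centre from PI", md_pi_center),
    ("Time MD centre from CI", time_center),
]:
    print(f"{label}: {value:.3f}")
```

Under these assumptions, the mean-difference CI and PI share a centre close to the reported +4.88, and the time interval centres near +0.4 min. The risk-ratio centre comes out slightly above the reported 1.59, a gap that is consistent with the lower bound having been rounded to two decimals. The check also makes the abstract's caution concrete: the PI for the composite score is many times wider than the CI around the same point estimate.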

Similar works