This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Can Open-Source Large Language Models Detect Medical Errors in Real-World Ophthalmology Reports?

2025 · 0 citations · AI · Open Access

Abstract

Accurate documentation is critical in ophthalmology, yet clinical notes often contain subtle errors that can affect decision-making. This study prospectively compared contemporary large language models (LLMs) for detecting clinically salient errors in emergency ophthalmology encounter notes and generating actionable corrections. 129 de-identified notes, each seeded with a predefined target error, were independently audited by four LLMs (o3 (OpenAI, closed-source), DeepSeek-v3-r1 (DeepSeek, open-source), MedGemma-27B (Google, open-source), and GPT-4o (OpenAI, closed-source)) using a standardized prompt. Two masked ophthalmologists graded error localization, relevance of additional issues, and overall recommendation quality, with within-case analyses applying appropriate nonparametric tests. Performance varied significantly across models (Cochran’s Q = 71.13, p = 2.44 × 10⁻¹⁵). o3 achieved the highest error localization accuracy at 95.7% (95% CI, 89.5–98.8), followed by DeepSeek-v3-r1 (90.3%), MedGemma-27B (80.9%), and GPT-4o (53.2%). Ordinal outcomes similarly favored o3 and DeepSeek-v3-r1 (both p < 10⁻⁹ vs. GPT-4o), with mean recommendation quality scores of 3.35, 3.05, 2.54, and 2.11, respectively. These findings demonstrate that LLMs can serve as accurate “second-eyes” for ophthalmology documentation. A proprietary model led on all metrics, while a strong open-source alternative approached its performance, offering potential for privacy-preserving on-premise deployment. Clinical translation will require oversight, workflow integration, and careful attention to ethical considerations.

Institutions

University of Split (HR)

Topics

Artificial Intelligence in Healthcare and Education · Electronic Health Records Systems · Retinal Imaging and Analysis
