This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Automated Clinical Information Extraction from Diagnostic and Nondiagnostic Radiology Reports Using Modern Language Models

2026 · 0 citations · Journal of Imaging Informatics in Medicine · Open Access

Citations: 0 · Authors: 5 · Year: 2026

Abstract

Our objective was to automate the extraction of clinical diagnoses from diagnostic and nondiagnostic radiology reports using modern language models and structured electronic health record (EHR) data. We selected venous thromboembolism (VTE) as our use case, a condition for which imaging is the gold standard but is not always fully diagnostic. We extracted venous duplex, computed tomography, and ventilation-perfusion scan reports from the Cleveland Clinic EHR system for patients admitted from 2011 to 2020. Each report's ground-truth label was positive, negative, or nondiagnostic. We compared multiple large language models (LLMs) and bidirectional encoder representations from transformers (BERT) models on multiclass classification in holdout evaluation sets. Error analysis guided iterative LLM prompt design and maximized the detection of nondiagnostic reports. ICD-10 codes and therapeutic anticoagulation data were used to adjudicate VTE diagnoses for patients with nondiagnostic reports. We identified 82,476 radiology reports among 213,724 patients. Across models, the multiclass areas under the receiver operating characteristic and precision-recall curves ranged from 0.83 to 0.96 and from 0.57 to 0.94, respectively. The most accurate model, Llama-3.3, detected 95% of VTE-positive reports with a precision of 99.6% and detected 87% of nondiagnostic reports with a precision of 88%. The positive detection rate increased to 98% when we paired structured EHR variables with minimal chart review (0.7% of the evaluation set) to adjudicate diagnoses for patients with nondiagnostic reports. In summary, Llama-3.3 was highly sensitive and specific for positive VTE diagnoses and nondiagnostic radiology reports. By integrating an LLM, structured EHR variables, and limited chart review, we successfully managed diagnostic uncertainty in automated information extraction from radiology reports.
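
The detection rates and precisions quoted in the abstract are per-class recall and precision over three labels. The sketch below shows how such figures could be computed, together with one possible way to resolve nondiagnostic reports from structured EHR flags. It is not the authors' code: the function names, the flag names (`has_vte_icd10`, `on_therapeutic_anticoagulation`), the adjudication rule, and the toy data are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter

# The three report labels named in the abstract.
LABELS = ("positive", "negative", "nondiagnostic")


def per_class_precision_recall(y_true, y_pred, labels=LABELS):
    """Return {label: (precision, recall)} using one-vs-rest counts."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this label, but it was wrong
            fn[truth] += 1  # the true label was missed
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        scores[label] = (precision, recall)
    return scores


def adjudicate(report_label, has_vte_icd10, on_therapeutic_anticoagulation):
    """Hypothetical rule (an assumption, not taken from the paper):
    resolve a nondiagnostic report from two structured EHR flags and
    send disagreeing cases to manual chart review."""
    if report_label != "nondiagnostic":
        return report_label
    if has_vte_icd10 and on_therapeutic_anticoagulation:
        return "positive"
    if not has_vte_icd10 and not on_therapeutic_anticoagulation:
        return "negative"
    return "needs_chart_review"


# Toy data, invented for illustration only.
y_true = ["positive", "positive", "negative", "nondiagnostic", "negative", "nondiagnostic"]
y_pred = ["positive", "negative", "negative", "nondiagnostic", "negative", "positive"]
scores = per_class_precision_recall(y_true, y_pred)
```

In this framing, "detected 95% of VTE-positive reports with a precision of 99.6%" corresponds to the `(precision, recall)` pair for the `positive` label, and the chart-review fraction corresponds to the share of cases that a rule of this kind cannot resolve on its own.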

Topics

Radiology practices and education · Artificial Intelligence in Healthcare and Education · Clinical Reasoning and Diagnostic Skills