OpenAlex · Updated hourly · Last updated: 04.04.2026, 01:15

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

ChatGPT vs Gemini: Comparative Accuracy and Efficiency in CAD-RADS Score Assignment from Radiology Reports

2024 · 15 citations · Journal of Imaging Informatics in Medicine · Open Access

Authors: 9

Abstract

This study aimed to evaluate the accuracy and efficiency of ChatGPT-3.5, ChatGPT-4o, Google Gemini, and Google Gemini Advanced in generating CAD-RADS scores based on radiology reports. This retrospective study analyzed 100 consecutive coronary computed tomography angiography reports performed between March 15, 2024, and April 1, 2024, at a single tertiary center. Each report containing a radiologist-assigned CAD-RADS score was processed using four large language models (LLMs) without fine-tuning. The findings section of each report was input into the LLMs, and the models were tasked with generating CAD-RADS scores. The accuracy of LLM-generated scores was compared to the radiologist's score. Additionally, the time taken by each model to complete the task was recorded. Statistical analyses included Mann-Whitney U test and interobserver agreement using unweighted Cohen's Kappa and Krippendorff's Alpha. ChatGPT-4o demonstrated the highest accuracy, correctly assigning CAD-RADS scores in 87% of cases (κ = 0.838, α = 0.886), followed by Gemini Advanced with 82.6% accuracy (κ = 0.784, α = 0.897). ChatGPT-3.5, although the fastest (median time = 5 s), was the least accurate (50.5% accuracy, κ = 0.401, α = 0.787). Gemini exhibited a higher failure rate (12%) compared to the other models, with Gemini Advanced slightly improving upon its predecessor. ChatGPT-4o outperformed other LLMs in both accuracy and agreement with radiologist-assigned CAD-RADS scores, though ChatGPT-3.5 was significantly faster. Despite their potential, current publicly available LLMs require further refinement before being deployed for clinical decision-making in CAD-RADS scoring.
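The agreement statistics named in the abstract can be illustrated with a minimal sketch. It assumes the standard definition of unweighted Cohen's kappa, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function names and the label lists below are hypothetical toy values chosen for illustration. They are not data from the study, and this is not the authors' analysis code. Krippendorff's alpha and the Mann-Whitney U test are not covered here.

```python
from collections import Counter


def accuracy(reference, predicted):
    """Fraction of cases where the predicted label equals the reference label."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(r == p for r, p in zip(reference, predicted)) / len(reference)


def cohen_kappa(reference, predicted):
    """Unweighted Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
    n = len(reference)
    p_o = accuracy(reference, predicted)  # observed agreement
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    # Chance agreement: sum over categories of the product of marginal proportions.
    p_e = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when expected agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy labels (NOT study data): CAD-RADS categories for ten reports.
radiologist = ["0", "1", "2", "3", "4A", "2", "1", "0", "3", "2"]
model = ["0", "1", "2", "2", "4A", "2", "1", "1", "3", "2"]

print(f"accuracy = {accuracy(radiologist, model):.3f}")
print(f"kappa    = {cohen_kappa(radiologist, model):.3f}")
```

Because kappa discounts agreement expected by chance, it is lower than raw accuracy whenever chance agreement is above zero. This matches the pattern in the abstract, where each reported κ is below the corresponding accuracy.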

Topics

Artificial Intelligence in Healthcare and Education · Radiomics and Machine Learning in Medical Imaging · Radiology practices and education