This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
AI in the Hot Seat: Head-to-Head Comparison of Large Language Models and Cardiologists in Emergency Scenarios
Citations: 0
Authors: 10
Year: 2026
Abstract
<b>Background:</b> The clinical applicability of large language models (LLMs) in high-stakes cardiac emergencies remains unexplored. This study evaluated how well advanced LLMs perform in managing complex catheterization laboratory (cath lab) scenarios and compared their performance with that of interventional cardiologists. <b>Methods and Results:</b> A cross-sectional study was conducted from 20 June to 2 December 2024. Twelve challenging inferior myocardial infarction scenarios were presented to seven LLMs (ChatGPT, Gemini, LLAMA, Qwen, Bing, Claude, DeepSeek) and five early-career interventional cardiologists. Responses were standardized, anonymized, and evaluated by thirty experienced interventional cardiologists. Performance comparisons were analyzed using a linear mixed-effects model with correlation and reliability statistics. Physicians had an average reference score of 80.68 (95% CI 76.3-85.0). Among LLMs, ChatGPT ranked highest (87.4, 95% CI 82.5-92.3), followed by Claude (80.8, 95% CI 75.7-85.9) and DeepSeek (78.7, 95% CI 72.9-84.6). LLAMA (73.7), Qwen (66.2), and Bing (64.3) ranked lower, while Gemini scored the lowest (59.0). ChatGPT scored higher than the early-career physician comparator group (difference 6.69, 95% CI 0.00-13.37; <i>p</i> < 0.05), whereas Gemini, LLAMA, Qwen, and Bing performed significantly worse; Claude and DeepSeek showed no significant difference. <b>Conclusions:</b> This expanded assessment reveals significant variability in LLM performance. In this simulated setting, ChatGPT demonstrated performance comparable to that of early-career interventional cardiologists. These results suggest that LLMs could serve as supplementary decision-support tools in interventional cardiology under simulated conditions.
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,260 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,116 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,493 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,438 citations
Authors
Institutions
- Northwestern University (US)
- Intel (United States) (US)
- Northwestern Medicine (US)
- Stanford University (US)
- Istanbul Eye Hospital (TR)
- State Hospital (GB)
- Sivas State Hospital (TR)
- Education Training And Research (US)
- Sağlık Bilimleri Üniversitesi (TR)
- University of Health Sciences Antigua (AG)
- Dr. Siyami Ersek Göğüs Kalp Ve Damar Cerrahisi Eğitim Ve Araştırma Hastanesi (TR)
- University of Maryland, Baltimore (US)