This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Performance Evaluation of Large Language Models With Retrieval-Augmented Generation in Cardiology Specialist Examinations in Japan

2025 · 0 citations · Circulation Reports · Open Access


Abstract

Background: Large language models (LLMs) have shown potential in medical education, but their application to cardiology specialist examinations remains underexplored. We compared the performance of a retrieval-augmented generation LLM (RAG-LLM), 'CardioCanon', with that of general-purpose LLMs. Methods and Results: A total of 96 publicly available, text-based, open-source multiple-choice questions from the Japanese Cardiology Specialist Examination (1997-2022) were used. CardioCanon showed option-level accuracy similar to that of ChatGPT-4o and Gemini 2.0 Flash (81.0%, 76.0%, and 77.2%, respectively), but higher case-based accuracy than ChatGPT-4o (57.3% vs. 29.2%, P<0.001). Conclusions: RAG techniques can enhance AI-assisted examination performance by improving case-level reasoning and decision-making.
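The abstract reports two different metrics, option-level and case-based accuracy, without defining them on this page. The sketch below shows one plausible reading, assuming that option-level accuracy scores each answer option separately (selected or not selected) and that case-based accuracy requires the full answer to a question to match the key. The function names, the five-option layout, and the toy data are hypothetical and are not taken from the paper. The last lines back-calculate the question counts implied by the reported case-based percentages, assuming both were computed over all 96 questions.

```python
from typing import List, Sequence, Set


def case_level_accuracy(predicted: List[Set[str]], key: List[Set[str]]) -> float:
    """Fraction of questions whose complete set of selected options equals the key."""
    hits = sum(p == k for p, k in zip(predicted, key))
    return hits / len(key)


def option_level_accuracy(
    predicted: List[Set[str]], key: List[Set[str]], options: Sequence[str]
) -> float:
    """Fraction of individual option judgements (selected / not selected) that agree with the key."""
    hits = 0
    total = 0
    for p, k in zip(predicted, key):
        for option in options:
            total += 1
            hits += (option in p) == (option in k)
    return hits / total


# Hypothetical toy data: three questions with five options each.
options = ["a", "b", "c", "d", "e"]
key = [{"a", "c"}, {"b"}, {"d", "e"}]
predicted = [{"a", "c"}, {"b", "c"}, {"a", "e"}]

case_acc = case_level_accuracy(predicted, key)
option_acc = option_level_accuracy(predicted, key, options)
print(f"case-level: {case_acc:.3f}, option-level: {option_acc:.3f}")

# Back-calculation from the reported percentages (assumes a denominator of 96).
n_questions = 96
cardiocanon_correct = round(0.573 * n_questions)
chatgpt_correct = round(0.292 * n_questions)
print(cardiocanon_correct, chatgpt_correct)
```

In the toy data a single wrong option makes the whole question count as incorrect at the case level while most option judgements still count as correct, which is one way option-level scores can be similar across models while case-based scores differ widely.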

Topics

Artificial Intelligence in Healthcare and Education · COVID-19 diagnosis using AI · Machine Learning in Healthcare
