This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Development of an LLM Pipeline Surpassing Physicians in Cardiovascular Risk Score Calculation
Citations: 2
Authors: 10
Year: 2025
Abstract
Background: Risk scores are essential to evidence-based cardiovascular care, but manual calculation is labor-intensive and error-prone. Large language models (LLMs) could automate this process, yet they are prone to calculation errors and factual hallucinations. Pipelines that separate LLM-based data extraction from deterministic score computation may improve reliability and transparency.
Methods: We conducted a retrospective diagnostic study at a quaternary heart center in Germany (January 2020 to July 2023). Patients with atrial fibrillation (n=179) from an ablation registry and patients with severe aortic stenosis (n=76) evaluated by a heart team were included. Five LLMs (DeepSeek-R1, Qwen3, GPT-4 Turbo, Llama 3.1, and PaLM 2) were tested in standalone and pipeline configurations to compute HAS-BLED, CHA₂DS₂-VASc, and EuroSCORE II scores from routine clinical reports. Accuracy was assessed by comparing predictions to expert-adjudicated ground truth, using root mean squared error (RMSE), Krippendorff’s alpha for categorical agreement, and calibration analysis.
Results: Pipeline-generated scores showed substantially higher agreement with expert adjudication than standalone LLMs and treating clinicians (mean Krippendorff’s alpha: 0.79 vs 0.32 vs 0.31, respectively) and demonstrated superior calibration. The Qwen3-based pipeline achieved the highest accuracy, with lower RMSEs than clinicians for HAS-BLED (0.20 vs 0.87), CHA₂DS₂-VASc (0.53 vs 1.08), and EuroSCORE II (1.99 vs 2.05).
Conclusion: LLM-based pipelines enable accurate, well-calibrated, and scalable cardiovascular risk score computation from unstructured real-world clinical data, outperforming clinicians and standalone LLMs, with the potential to reduce clinician workload and support evidence-based care.
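The pipeline idea in the abstract (the LLM only extracts findings, and a deterministic function computes the score) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `ExtractedFindings` record, its field names, and the helper functions are hypothetical, and the extraction step itself is assumed to have already produced the structured fields. The point weights follow the standard published CHA₂DS₂-VASc definition. An RMSE helper of the kind used to compare predictions with ground truth is included.

```python
import math
from dataclasses import dataclass


@dataclass
class ExtractedFindings:
    """Hypothetical structured output of the LLM extraction step."""
    age: int
    female: bool
    heart_failure: bool
    hypertension: bool
    diabetes: bool
    stroke_tia_thromboembolism: bool
    vascular_disease: bool


def cha2ds2_vasc(f: ExtractedFindings) -> int:
    """Deterministic CHA2DS2-VASc score from already-extracted findings."""
    score = 0
    score += 1 if f.heart_failure else 0               # C: congestive heart failure
    score += 1 if f.hypertension else 0                # H: hypertension
    score += 1 if f.diabetes else 0                    # D: diabetes mellitus
    score += 2 if f.stroke_tia_thromboembolism else 0  # S2: prior stroke/TIA/thromboembolism
    score += 1 if f.vascular_disease else 0            # V: vascular disease
    score += 1 if f.female else 0                      # Sc: sex category (female)
    if f.age >= 75:                                    # A2: age 75 or older
        score += 2
    elif f.age >= 65:                                  # A: age 65 to 74
        score += 1
    return score


def rmse(predicted: list[float], truth: list[float]) -> float:
    """Root mean squared error between predicted and reference scores."""
    if len(predicted) != len(truth) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - t) ** 2 for p, t in zip(predicted, truth)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative (made-up) patient: 78-year-old woman with hypertension only.
patient = ExtractedFindings(
    age=78, female=True, heart_failure=False, hypertension=True,
    diabetes=False, stroke_tia_thromboembolism=False, vascular_disease=False,
)
print(cha2ds2_vasc(patient))
```

Keeping the arithmetic outside the model means every point in the score can be traced to one extracted field, which is the transparency benefit the abstract attributes to the pipeline configuration.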
Related works
Aspirin plus Clopidogrel as Secondary Prevention after Stroke or Transient Ischemic Attack: A Systematic Review and Meta-Analysis
2014 · 11,546 citations
Dabigatran versus Warfarin in Patients with Atrial Fibrillation
2009 · 11,121 citations
2020 ESC Guidelines for the diagnosis and management of atrial fibrillation developed in collaboration with the European Association for Cardio-Thoracic Surgery (EACTS)
2020 · 9,690 citations
2017 ESC Guidelines for the management of acute myocardial infarction in patients presenting with ST-segment elevation
2017 · 9,589 citations
Rivaroxaban versus Warfarin in Nonvalvular Atrial Fibrillation
2011 · 9,311 citations