This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Few-shot prompting strategies for improving large language model-based cardiovascular disease risk prediction
Citations: 0
Authors: 7
Year: 2026
Abstract
Accurate prediction of cardiovascular disease (CVD) risk enables earlier prevention and better clinical decisions. Conventional models such as the Framingham Risk Score (FRS) and Atherosclerotic Cardiovascular Disease (ASCVD) equations may generalize poorly across diverse populations and incomplete electronic health records (EHRs). In this paper, we present a prompting-based alternative that uses few-shot in-context learning to guide large language models (LLMs) in estimating 10-year CVD risk without retraining, offering a data-efficient and privacy-conscious alternative to fine-tuned medical LLM pipelines. Using 352 de-identified MIMIC-III/IV records, we evaluate GPT-4.1, GPT-4o, and Qwen3-4B against FRS and ASCVD outputs under zero-shot and few-shot prompting, random versus similarity-based exemplar selection, and with or without chain-of-thought reasoning. Few-shot prompting substantially improves calculator alignment for GPT-4.1 and GPT-4o, whereas Qwen3-4B shows weaker gains. With 40 examples and reasoning enabled, GPT-4.1 achieves AUPRC 0.951, mean absolute error about 7, root mean squared error about 9, and F1-score 0.85, while GPT-4o performs comparably. Within the white-cohort similarity analysis, five similarity-selected exemplars match or outperform 20 randomly selected examples across error and discrimination metrics, showing that exemplar quality can outweigh quantity under tight context budgets. Overall, these findings indicate that few-shot prompting can closely approximate validated clinical calculators in data-limited settings and can be adapted across institutions and patient populations through exemplar selection rather than retraining. However, clinical utility remains bounded by the strengths and weaknesses of the underlying calculators, and we do not evaluate prediction of observed cardiovascular events.
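The similarity-based exemplar selection and error metrics described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the authors' pipeline: the feature names in `FEATURES`, the `risk_pct` field, the standardized Euclidean distance, the prompt wording, and the helper names `select_exemplars`, `build_prompt`, and `mae_rmse` are all hypothetical. The abstract does not state which similarity measure or prompt template was used.

```python
import math

# Hypothetical feature names; the abstract does not list the model inputs.
FEATURES = ("age", "systolic_bp", "total_chol", "hdl_chol")


def _feature_stats(records):
    """Return {feature: (mean, std)} computed over the exemplar pool."""
    stats = {}
    for f in FEATURES:
        vals = [r[f] for r in records]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        # Fall back to 1.0 so a constant feature does not divide by zero.
        stats[f] = (mean, math.sqrt(var) or 1.0)
    return stats


def select_exemplars(query, pool, k=5):
    """Pick the k pool records closest to the query patient.

    Assumption: closeness is Euclidean distance on features scaled by
    their pool standard deviation. The paper may use another measure.
    """
    stats = _feature_stats(pool)

    def dist(record):
        return math.sqrt(
            sum(((record[f] - query[f]) / stats[f][1]) ** 2 for f in FEATURES)
        )

    return sorted(pool, key=dist)[:k]


def build_prompt(query, exemplars):
    """Assemble a few-shot prompt; the wording is illustrative only."""
    parts = ["Estimate the 10-year CVD risk (percent) for the final patient."]
    for ex in exemplars:
        desc = ", ".join(f"{f}={ex[f]}" for f in FEATURES)
        parts.append(f"Patient: {desc}\nRisk: {ex['risk_pct']}")
    desc = ", ".join(f"{f}={query[f]}" for f in FEATURES)
    parts.append(f"Patient: {desc}\nRisk:")
    return "\n\n".join(parts)


def mae_rmse(predicted, reference):
    """Mean absolute error and root mean squared error against a reference
    such as calculator (FRS or ASCVD) outputs."""
    errors = [p - t for p, t in zip(predicted, reference)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse
```

In such a setup, the string returned by `build_prompt` would be sent to the LLM, and the parsed numeric answers would be compared with the calculator outputs through `mae_rmse`. Selecting a handful of close exemplars rather than many random ones is what keeps the prompt short under a tight context budget.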
Related works
"Why Should I Trust You?"
2016 · 14,688 citations
Coding Algorithms for Defining Comorbidities in ICD-9-CM and ICD-10 Administrative Data
2005 · 10,544 citations
A Comprehensive Survey on Graph Neural Networks
2020 · 8,925 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,504 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,025 citations