This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Large Language Models in Cardiovascular Imaging: Current Applications and Future Prospects

2025 · 2 citations · Med Research · Open Access

Abstract

Although large language models (LLMs) have demonstrated significant potential in medical imaging, their applications in cardiovascular imaging remain at a relatively early stage of exploration [1]. The four core modalities of cardiovascular imaging, namely cardiac magnetic resonance (CMR), coronary computed tomography angiography (CCTA), echocardiography, and nuclear medicine imaging, each possess unique advantages and functionally complement one another, providing differentiated application scenarios for LLMs (Figure 1, Table 1). CMR enables automated report generation and precise quantitative analysis, whereas CCTA excels in calcium scoring computation and intelligent plaque characterization. Echocardiography leverages its real-time imaging capabilities to achieve breakthroughs in standardized view recognition and hemodynamic parameter extraction. Nuclear medicine imaging demonstrates particular strengths through quantitative perfusion defect analysis and intelligent correlation with clinical symptoms.

Figure 1. Schematic overview of large language model (LLM) applications across major cardiovascular imaging modalities. Four key imaging techniques are compared: cardiac magnetic resonance (CMR), coronary computed tomography angiography (CCTA), echocardiography, and nuclear medicine. For each modality, the key technical advantage is highlighted, followed by specific examples of LLM implementations aimed at enhancing workflow, quantification, and interpretation.

Table 1.

CMR
Applications: Assisted myocarditis diagnosis; hypertrophic cardiomyopathy (HCM) phenotypic analysis; automated report generation; patient communication optimization.
Performance: Diagnostic accuracy comparable to junior radiologists; high phenotypic classification accuracy.

CCTA
Applications: CAD-RADS intelligent scoring; precise plaque identification; clinical decision support.
Performance: High consistency in automated scoring; excellent plaque identification accuracy; effectively supports revascularization decisions.

Echocardiography
Applications: Intelligent report generation; multicenter data integration; long-term follow-up analysis; intraoperative real-time assistance.
Performance: Report quality reaches clinical expert level; local deployment ensures data security; integrated frameworks significantly improve diagnostic efficiency.

Nuclear medicine
Applications: Examination protocol optimization; intelligent risk stratification; multimodal data correlation analysis.
Performance: Text comprehension reaches expert level; high accuracy in protocol selection; reliable risk stratification.

Although LLMs such as GPT-4 show promise in CMR, matching junior radiologists in myocarditis diagnosis, their performance is highly data-dependent. The persistent gap with senior experts underscores the irreplaceable value of clinical expertise [2]. For hypertrophic cardiomyopathy, specialized natural language processing systems extract 21 phenotypic features, including mitral regurgitation and left atrial enlargement, with categorical accuracy exceeding 93%, whereas numerical parameters, such as left ventricular ejection fraction, attain 99% reliability, surpassing manual annotation consistency [3]. Frameworks based on Bidirectional Encoder Representations from Transformers (BERT) revolutionize labeling through semisupervised learning, achieving F1 scores of 86% [4]. GPT-4 improves patient communication by simplifying jargon but risks oversimplifying complex findings, such as translating “transmural late gadolinium enhancement” to “heart scar” [5]. Moreover, a vision-language model excels in predicting arrhythmia risk in hypertrophic cardiomyopathy by fusing CMR with clinical text, achieving AUCs of 0.89 (internal) and 0.81 (external) and surpassing clinical guidelines by 0.22–0.35 [6]. It enables synergistic visual-text understanding while maintaining generalizability, fairness, and explainability, serving as an effective multimodal clinical aid.
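For readers less familiar with the F1 score reported above for BERT-based labeling, the following is a minimal sketch of how it is computed for a single binary label: the harmonic mean of precision and recall. The function name and the toy labels are hypothetical and are not taken from the cited study [4].

```python
def f1_score(gold: list[bool], predicted: list[bool]) -> float:
    """Return the F1 score (harmonic mean of precision and recall) for one binary label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)

    if tp == 0:
        # No correct positive predictions: precision or recall is zero or undefined.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (invented labels): does a report mention a given finding?
gold_labels = [True, True, True, False, False]
model_labels = [True, True, False, True, False]
score = f1_score(gold_labels, model_labels)
```

In this toy case there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F1 score is also 2/3.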
Despite high accuracy in automated CAD-RADS scoring, GPT-4o's tendency to overestimate risk could lead to unnecessary interventions, underscoring a critical clinical safety concern [7]. Custom natural language processing (NLP) pipelines process over 30,000 reports, identifying plaques with 95% accuracy, but fail to recognize stent or coronary artery bypass grafting modifiers in 67% of cases, with errors escalating by 40% during complex report interpretation [8, 9]. Although AI guidance can significantly reduce unnecessary procedures in cardiac care, its reliability is fundamentally limited by the quality of the original imaging data, not just by its own analytical power [10]. Critically, the absence of visual feature interpretation remains the pivotal gap, particularly regarding lipid-core plaques under 30 Hounsfield units, which limits personalized risk assessment.

The integration of AI and LLMs into echocardiography has enhanced clinical efficiency and diagnostic precision. A pivotal advancement is the development of specialized LLMs such as EchoGPT [11], which leverages fine-tuned open-source architectures such as Llama-2-7B with quantized low-rank adaptation (QLoRA) to automate the summarization of detailed “Findings” into concise “Impressions”, achieving good accuracy and matching cardiologist-level conciseness; this significantly reduces reporting burdens while maintaining clinical rigor. Beyond summarization, LLMs now excel at structured data extraction from unstructured reports, as demonstrated by HeartDX-LM [12], which transforms free-text echocardiography reports into tabular datasets of clinical variables. Privacy concerns are addressed by locally deploying open-source models [13, 14], which eliminates cloud risks while maintaining expert-level performance in long-term temporal analysis and severity assessment [15].
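To make the idea of turning free-text echocardiography reports into tabular data concrete, here is a minimal sketch. It is not HeartDX-LM's method: it swaps the LLM for simple regular expressions purely for illustration, and the variable names (`lvef_percent`, `la_diameter_mm`), patterns, and sample reports are all hypothetical.

```python
import csv
import io
import re

# Hypothetical variables and patterns; real reports use far more varied wording.
PATTERNS = {
    "lvef_percent": re.compile(
        r"(?:LVEF|ejection fraction)\D{0,20}?(\d{1,3}(?:\.\d+)?)\s*%", re.IGNORECASE
    ),
    "la_diameter_mm": re.compile(
        r"left atri(?:um|al)\D{0,30}?(\d{1,3}(?:\.\d+)?)\s*mm", re.IGNORECASE
    ),
}


def report_to_row(report_id: str, text: str) -> dict:
    """Extract one table row from a free-text report; missing values become None."""
    row = {"report_id": report_id}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        row[name] = float(match.group(1)) if match else None
    return row


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize extracted rows as CSV text; None is written as an empty cell."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["report_id", *PATTERNS])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample reports for illustration.
rows = [
    report_to_row("r1", "LVEF 55%. Left atrium diameter 42 mm."),
    report_to_row("r2", "Ejection fraction is preserved."),
]
table = rows_to_csv(rows)
```

The second report contains no numeric value, so both of its cells stay empty rather than being guessed. That is the conservative behaviour one would also want from an LLM-based extractor.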
Furthermore, ensemble frameworks unify multiple LLMs via supermajority voting to automate intraoperative reporting with high accuracy [16], establishing them as indispensable tools for redefining echocardiography workflows.

Recent evaluations show that LLMs such as GPT-4o perform strongly in cardiovascular nuclear medicine tasks such as protocol selection and risk stratification. When tested on board-style exam materials, GPT-4o outperformed other models and approached human-level diagnostic accuracy, demonstrating its potential as a decision-support tool [17]. However, all models showed notable limitations in image-based interpretation, struggling with tasks such as correlating nuclear medicine imaging findings with clinical features, which indicates unresolved challenges in multimodal understanding. Time-stability testing revealed consistent performance across most domains, though isolated degradation in image-intensive sections underscores persistent vulnerabilities in handling visual data.

Despite these strides, significant hurdles impede seamless clinical integration, primarily concerning evaluation reliability, interoperability, and error management. Current limitations center on three interconnected frontiers: stringent data dependency, which degrades performance with incomplete inputs, as exemplified by missing T1/T2 mapping sequences reducing diagnostic accuracy by 7%–21% [2, 4]; modality isolation, which prevents synthesis of imaging biomarkers such as plaque vulnerability features [7, 9]; and underdeveloped temporal analysis capabilities, with models scoring only 8.2 RadGraph F1 for longitudinal comparisons [11]. Finally, ethical risks and the lack of standardized benchmarks perpetuate implementation uncertainties, necessitating collaborative efforts to establish clinical validation protocols. Future progress depends on multimodal AI-NLP architectures capable of intelligent report generation and precise uncertainty quantification.
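The supermajority voting used by the ensemble frameworks for intraoperative echocardiography reporting [16] can be illustrated with a short sketch. The two-thirds threshold, the function name, and the sample answers are assumptions made for this example; the source does not specify how the cited framework defines its supermajority.

```python
from collections import Counter
from typing import Optional


def supermajority_vote(answers: list[str], threshold: float = 2 / 3) -> Optional[str]:
    """Return the answer shared by at least `threshold` of the models, else None.

    The 2/3 default is an assumed value for illustration only.
    """
    if not answers:
        return None

    # Normalize trivial formatting differences before counting votes.
    normalized = [answer.strip().lower() for answer in answers]
    label, count = Counter(normalized).most_common(1)[0]

    # Accept the leading answer only if it reaches the supermajority threshold.
    if count / len(normalized) >= threshold:
        return label
    return None


# Invented outputs from three hypothetical models grading the same finding.
agreed = supermajority_vote(["Severe", "severe ", "moderate"])
disputed = supermajority_vote(["mild", "moderate", "severe"])
```

Returning `None` when no answer reaches the threshold leaves room to route disputed cases to a clinician instead of forcing a result, which ties in with the error-management concerns raised above.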
The integration of LLMs across cardiovascular imaging modalities promises transformative advancements, contingent upon overcoming significant technical and ethical hurdles. Key developments include the evolution of LLMs in nuclear medicine to provide comprehensive interpretation of quantitative perfusion data by synergizing with deep learning systems, and their application in cardiac magnetic resonance for automated tissue characterization that correlates fibrosis patterns with functional parameters. In coronary computed tomography angiography, the fusion of anatomical plaque assessment with clinical risk prediction will enable personalized protocol optimization, whereas echocardiography will leverage LLMs for real-time procedural guidance through dynamic interpretation of Doppler flow patterns, utilizing federated learning to ensure data privacy. Realizing this potential requires harmonized multimodal datasets and robust validation pathways that incorporate clinician-in-the-loop benchmarking.

Crucially, a rigorous ethical framework is imperative, requiring advanced de-identification techniques with data provenance tracking, real-time uncertainty quantification to manage diagnostic errors, and algorithmic transparency standards. The pace of adoption ultimately depends on resolving technical barriers in multimodal data fusion and interoperability, while simultaneously addressing these ethical considerations through collaborative efforts to establish standardized, clinically grounded validation protocols.

The authors have nothing to report.

Topics

Cardiac Imaging and Diagnostics; Artificial Intelligence in Healthcare and Education; Radiomics and Machine Learning in Medical Imaging
