OpenAlex · Updated hourly · Last updated: 13 March 2026, 18:22

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Unveiling large multimodal models in pulmonary CT: A comparative assessment of generative AI performance in lung cancer diagnostics

2025 · 1 citation · Open Access
Citations: 1
Authors: 18
Year: 2025

Abstract

Introduction: Generative artificial intelligence (Gen‐AI) is increasingly recognized for its potential in healthcare, particularly in complex radiological interpretation. However, the clinical utility of Gen‐AI requires thorough validation with real‐world data.

Method: This retrospective study analyzed chest computed tomography (CT) scans from 404 patients with lung conditions, including lung neoplasms (n = 184) and non‐malignancy (n = 210), incorporating The Cancer Genome Atlas (n = 106) and Medical Imaging and Data Resource Center (n = 110) datasets as external validation. We evaluated the diagnostic performance of three Gen‐AI models (GPT‐4‐turbo, Gemini‐pro‐vision, and Claude‐3‐opus) using receiver operating characteristic (ROC) analysis and chi‐square tests across various clinical scenarios. Likert scale scoring, combined with response rate and variance analysis, was employed to evaluate internal diagnostic tendencies, while Lasso and stepwise regression were introduced externally to optimize model performance.

Results: In single‐image CT diagnostics, Gemini and Claude demonstrated superior accuracy compared to GPT. However, when additional CT slices or clinical histories were incorporated, the diagnostic accuracy of all models declined. ROC analysis indicated that Gen‐AI performance was limited but improved in simplified prompting environments or when integrated with machine learning methods. Feature analysis revealed that Gen‐AI relied primarily on morphology and margins for malignancy predictions, but struggled to recognize critical imaging features and occasionally fabricated data.

Conclusions: Gen‐AI demonstrated variable potential for pulmonary CT imaging diagnosis across prompts and diagnostic environments of differing complexity. However, the models' limitations and risks in processing complex multimodal information highlight significant challenges in the integration of clinical information by existing models. Ongoing efforts to improve the robustness and reliability of these models are crucial for their successful adoption in healthcare.

Similar works