This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
The Accuracy of ChatGPT-4o in Interpreting Chest and Abdominal X-Ray Images
Citations: 8
Authors: 8
Year: 2025
Abstract
<b>Background/Objectives:</b> Large language models (LLMs), such as ChatGPT, have emerged as potential clinical support tools to enhance precision in personalized patient care, but their reliability in radiological image interpretation remains uncertain. The primary aim of our study was to evaluate the diagnostic accuracy of ChatGPT-4o in interpreting chest X-rays (CXRs) and abdominal X-rays (AXRs) by comparing its performance to expert radiology findings; the secondary aims were to assess diagnostic confidence and patient safety. <b>Methods:</b> A total of 500 X-rays, comprising 257 CXRs (51.4%) and 243 AXRs (48.6%), were analyzed. Diagnoses made by ChatGPT-4o were compared to expert interpretations. Confidence scores (1-4) were assigned, and responses were evaluated for patient safety. <b>Results:</b> ChatGPT-4o correctly identified 345 of 500 (69%) pathologies (95% CI: 64.81-72.90). For AXRs, 175 of 243 (72.02%) pathologies were correctly diagnosed (95% CI: 66.06-77.28), while for CXRs, 170 of 257 (66.15%) were correctly diagnosed (95% CI: 60.16-71.66). Among CXRs, the highest detection rates were observed for pulmonary edema, tumor, pneumonia, pleural effusion, cardiomegaly, and emphysema, and lower rates for pneumothorax, rib fractures, and enlarged mediastinum. AXR performance was highest for intestinal obstruction and foreign bodies, and weaker for pneumoperitoneum, renal calculi, and diverticulitis. Confidence scores were higher for AXRs (mean 3.45 ± 1.1) than for CXRs (mean 2.48 ± 1.45). All responses (100%) were considered safe for the patient. Interobserver agreement was high (kappa = 0.920), and reliability on a second prompt was moderate (kappa = 0.750). <b>Conclusions:</b> ChatGPT-4o demonstrated moderate accuracy in the interpretation of X-rays, with higher accuracy for AXRs than for CXRs. Improvements are required before it can be used as an efficient clinical support tool.
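The abstract reports proportions with 95% confidence intervals but does not name the interval method. As a minimal sketch, assuming a Wilson score interval (an assumption, not something the abstract states), the following recomputes the intervals from the reported counts. The function name `wilson_interval` and the `reported` dictionary are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion, returned as fractions."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Counts taken from the abstract: (correctly identified, total images).
reported = {"All": (345, 500), "AXR": (175, 243), "CXR": (170, 257)}

for label, (k, n) in reported.items():
    low, high = wilson_interval(k, n)
    print(f"{label}: {100 * k / n:.2f}% (95% CI: {100 * low:.2f}-{100 * high:.2f})")
```

Under this assumption the recomputed bounds should land very close to the intervals quoted in the abstract, which suggests, but does not confirm, that a Wilson-type interval was used.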
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,231 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,084 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,444 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,423 citations