This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Diagnostic performance of ChatGPT in tibial plateau fracture in knee X-ray

2024 · 0 citations · Open Access

Abstract

Purpose: Tibial plateau fractures are relatively common and require accurate diagnosis. Chat Generative Pre-Trained Transformer (ChatGPT) has emerged as a tool to improve medical diagnosis. This study aims to investigate the accuracy of this tool in diagnosing tibial plateau fractures.

Methods: A secondary analysis was performed on 111 knee radiographs from emergency department patients, 29 of which showed fractures confirmed by computed tomography (CT). The X-rays were reviewed by a board-certified emergency physician (EP) and a radiologist and then analyzed by ChatGPT-4 and ChatGPT-4o. Diagnostic performance was compared using the area under the receiver operating characteristic curve (AUC). Sensitivity, specificity, and likelihood ratios were also calculated.

Results: Sensitivity and negative likelihood ratio were 58.6% (95% CI: 38.9%–76.4%) and 0.4 (95% CI: 0.3–0.7) for the EP, 72.4% (95% CI: 52.7%–87.2%) and 0.3 (95% CI: 0.2–0.6) for the radiologist, 27.5% (95% CI: 12.7%–47.2%) and 0.7 (95% CI: 0.6–0.9) for ChatGPT-4, and 55.1% (95% CI: 35.6%–73.5%) and 0.4 (95% CI: 0.3–0.7) for ChatGPT-4o. Specificity and positive likelihood ratio were 85.3% (95% CI: 75.8%–92.2%) and 4.0 (95% CI: 2.1–7.3) for the EP, 76.8% (95% CI: 66.2%–85.4%) and 3.1 (95% CI: 1.9–4.9) for the radiologist, 95.1% (95% CI: 87.9%–98.6%) and 5.6 (95% CI: 1.8–17.3) for ChatGPT-4, and 93.9% (95% CI: 86.3%–97.9%) and 9.0 (95% CI: 3.6–22.4) for ChatGPT-4o. The AUC was 0.72 (95% CI: 0.6–0.8) for the EP, 0.61 (95% CI: 0.4–0.7) for ChatGPT-4, 0.74 (95% CI: 0.6–0.8) for ChatGPT-4o, and 0.75 (95% CI: 0.6–0.8) for the radiologist. The EP and the radiologist significantly outperformed ChatGPT-4 (P = 0.02 and 0.01, respectively), whereas there was no significant difference among the EP, ChatGPT-4o, and the radiologist.

Conclusion: This study showed that ChatGPT-4o had the potential to significantly impact medical imaging diagnosis.
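As a reading aid, the likelihood ratios in the abstract can be approximately recovered from the reported sensitivity and specificity using the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The sketch below is not the authors' analysis code; the function names and the `READERS` table are illustrative, and the single-point AUC formula, (sensitivity + specificity) / 2, is an assumption that holds only when each reader gives one binary call per image. The abstract does not state how the AUC was computed.

```python
# Sensitivity and specificity as reported in the abstract, expressed as fractions.
READERS = {
    "EP": (0.586, 0.853),
    "Radiologist": (0.724, 0.768),
    "ChatGPT-4": (0.275, 0.951),
    "ChatGPT-4o": (0.551, 0.939),
}


def positive_lr(sens: float, spec: float) -> float:
    """LR+ : how much a positive reading raises the odds of fracture."""
    return sens / (1.0 - spec)


def negative_lr(sens: float, spec: float) -> float:
    """LR- : how much a negative reading lowers the odds of fracture."""
    return (1.0 - sens) / spec


def single_point_auc(sens: float, spec: float) -> float:
    """AUC of a ROC curve with a single operating point.

    Assumption (not stated in the abstract): each reader makes one
    binary fracture / no-fracture call per radiograph.
    """
    return (sens + spec) / 2.0


def summarize(readers=READERS) -> dict:
    """Compute LR+, LR- and single-point AUC for every reader."""
    return {
        name: {
            "lr_pos": positive_lr(sens, spec),
            "lr_neg": negative_lr(sens, spec),
            "auc": single_point_auc(sens, spec),
        }
        for name, (sens, spec) in readers.items()
    }


if __name__ == "__main__":
    for name, row in summarize().items():
        print(f"{name}: LR+={row['lr_pos']:.2f} "
              f"LR-={row['lr_neg']:.2f} AUC={row['auc']:.3f}")
```

Because the inputs are already rounded percentages, the recomputed values can differ from the published ones in the last reported digit.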

Institutions

Tehran University of Medical Sciences (IR)

Topics

Artificial Intelligence in Healthcare and Education; Radiomics and Machine Learning in Medical Imaging; Ultrasound in Clinical Applications
