OpenAlex · Updated hourly · Last updated: 17.03.2026, 00:19

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

A Comparative Study: Diagnostic Performance of ChatGPT 3.5, Google Bard, Microsoft Bing, and Radiologists in Thoracic Radiology Cases

2024 · 6 citations · 2 authors · Open Access

Abstract

Purpose: To investigate and compare the diagnostic performance of ChatGPT 3.5, Google Bard, Microsoft Bing, and two board-certified radiologists on thoracic radiology cases published by the Society of Thoracic Radiology.

Materials and Methods: We collected 124 "Case of the Month" cases from the Society of Thoracic Radiology website, published between March 2012 and December 2023. The medical history and imaging findings were input into ChatGPT 3.5, Google Bard, and Microsoft Bing to obtain a diagnosis and differential diagnosis. Two board-certified radiologists provided their own diagnoses. Cases were categorized anatomically (parenchyma, airways, mediastinum-pleura-chest wall, and vascular) and further classified as specific or non-specific for radiological diagnosis. Diagnostic accuracy and differential diagnosis scores were analyzed using chi-square, Kruskal-Wallis, and Mann-Whitney U tests.

Results: Among the 124 cases, ChatGPT demonstrated the highest diagnostic accuracy (53.2%), outperforming the radiologists (52.4% and 41.1%), Bard (33.1%), and Bing (29.8%). Specific cases revealed varying diagnostic accuracies: Radiologist I achieved 65.6%, surpassing ChatGPT (63.5%), Radiologist II (52.0%), Bard (39.5%), and Bing (35.4%). ChatGPT 3.5 and Bing had higher differential diagnosis scores in specific cases (P<0.05), whereas Bard did not (P=0.114). All three models had higher diagnostic accuracy in specific cases (P<0.05). No differences were found in diagnostic accuracy or differential diagnosis scores across the four anatomical locations (P>0.05).

Conclusion: ChatGPT 3.5 demonstrated higher diagnostic accuracy than Bing, Bard, and the radiologists in text-based thoracic radiology cases. Large language models hold great promise in this field under proper medical supervision.
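The abstract reports that diagnostic accuracies were compared with a chi-square test. As a minimal sketch of such a comparison, assuming a 5×2 contingency table whose counts are back-calculated here from the reported percentages over 124 cases (an illustrative reconstruction, not the study's actual data), using SciPy:

```python
# Illustrative reconstruction: chi-square test comparing diagnostic
# accuracy across the five readers. Counts are derived from the
# percentages reported in the abstract (124 cases total), not taken
# from the authors' dataset.
from scipy.stats import chi2_contingency

N_CASES = 124

# Correct diagnoses per reader, back-calculated from reported accuracies.
correct = {
    "ChatGPT 3.5":    66,  # 53.2%
    "Radiologist I":  65,  # 52.4%
    "Radiologist II": 51,  # 41.1%
    "Google Bard":    41,  # 33.1%
    "Microsoft Bing": 37,  # 29.8%
}

# 5x2 contingency table: one row per reader, columns = (correct, incorrect).
table = [[c, N_CASES - c] for c in correct.values()]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
```

With five readers and two outcomes, the test has (5−1)×(2−1) = 4 degrees of freedom; a small p-value would indicate that at least one reader's accuracy differs from the others.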

Topics

Artificial Intelligence in Healthcare and Education · Radiomics and Machine Learning in Medical Imaging · Radiology practices and education