Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Radiological Differential Diagnoses Based on Cardiovascular and Thoracic Imaging Patterns: Perspectives of Four Large Language Models
30
Zitationen
5
Autoren
2023
Jahr
Abstract
<b>Background</b> Differential diagnosis in radiology is a critical aspect of clinical decision-making. Radiologists in the early stages may find difficulties in listing the differential diagnosis from image patterns. In this context, the emergence of large language models (LLMs) has introduced new opportunities as these models have the capacity to access and contextualize extensive information from text-based input. <b>Objective</b> The objective of this study was to explore the utility of four LLMs-ChatGPT3.5, Google Bard, Microsoft Bing, and Perplexity-in providing most important differential diagnoses of cardiovascular and thoracic imaging patterns. <b>Methods</b> We selected 15 unique cardiovascular ( <i>n</i> = 5) and thoracic ( <i>n</i> = 10) imaging patterns. We asked each model to generate top 5 most important differential diagnoses for every pattern. Concurrently, a panel of two cardiothoracic radiologists independently identified top 5 differentials for each case and came to consensus when discrepancies occurred. We checked the concordance and acceptance of LLM-generated differentials with the consensus differential diagnosis. Categorical variables were compared by binomial, chi-squared, or Fisher's exact test. <b>Results</b> A total of 15 cases with five differentials generated a total of 75 items to analyze. The highest level of concordance was observed for diagnoses provided by Perplexity (66.67%), followed by ChatGPT (65.33%) and Bing (62.67%). The lowest score was for Bard with 45.33% of concordance with expert consensus. The acceptance rate was highest for Perplexity (90.67%), followed by Bing (89.33%) and ChatGPT (85.33%). The lowest acceptance rate was for Bard (69.33%). <b>Conclusion</b> Four LLMs-ChatGPT3.5, Google Bard, Microsoft Bing, and Perplexity-generated differential diagnoses had high level of acceptance but relatively lower concordance. There were significant differences in acceptance and concordance among the LLMs. Hence, it is important to carefully select the suitable model for usage in patient care or in medical education.
Ähnliche Arbeiten
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8.245 Zit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8.100 Zit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7.466 Zit.
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5.776 Zit.
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5.429 Zit.