This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Expert evaluation of GPT-4o and Gemini responses to patient questions on carotid endarterectomy
Citations: 0
Authors: 7
Year: 2026
Abstract
OBJECTIVE: The aim of this study was to compare the accuracy, scientific quality, and clarity of responses generated by GPT-4o and Gemini to frequently asked patient questions related to carotid artery disease and carotid endarterectomy.
METHODS: In total, 40 unique carotid endarterectomy-related questions were compiled from online sources and clinical experience. Each question was entered into a separate new session with GPT-4o and Gemini 2.5 Flash in Turkish, and the responses were collected without modification. Four blinded cardiovascular surgeons independently rated each answer on a 5-point Likert scale in three domains: Accuracy, Scientific Quality, and Clarity. Mean response lengths and domain scores were compared using appropriate paired tests.
RESULTS: GPT-4o produced longer responses than Gemini (258.1±101.6 vs. 193.2±43.7 words; p<0.001). Overall, GPT-4o had higher Accuracy scores (4.33±0.39 vs. 4.16±0.33; p=0.04), with no significant differences in Scientific Quality or Clarity (p=0.377 and p=0.154, respectively). In rater-level analyses, Gemini scored higher in Clarity for one rater, whereas GPT-4o was superior in Accuracy and Scientific Quality for another. Overall mean scores were comparable (4.17±0.36 vs. 4.13±0.31; p=0.636). Physician referral was recommended in 62.5% of GPT-4o responses and 52.5% of Gemini responses (p=0.366).
CONCLUSION: Both GPT-4o and Gemini provided "good"-quality responses to carotid endarterectomy patient questions, with GPT-4o showing a modest accuracy advantage and no significant difference in the other domains. Explicit disclaimers on both platforms underscore their supportive, not definitive, role in patient education. Physicians should remain the primary source for individualized decisions, and AI-generated information should always be verified.
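The abstract states only that "appropriate paired tests" were used, without naming them. Below is a minimal sketch of the kind of paired comparison such a design implies, assuming per-question Accuracy scores averaged across the four raters and a normality check to choose between a paired t-test and a Wilcoxon signed-rank test. The data here are simulated from the reported means and standard deviations and are purely illustrative, not the study's data or its actual analysis code.

```python
# Illustrative paired comparison of per-question domain scores (hypothetical data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated per-question Accuracy scores for 40 questions (mean of 4 raters),
# drawn from the means/SDs reported in the abstract; NOT the real study data.
gpt4o_accuracy = rng.normal(4.33, 0.39, 40).clip(1, 5)
gemini_accuracy = rng.normal(4.16, 0.33, 40).clip(1, 5)

# Check normality of the paired differences to decide which test to apply.
diff = gpt4o_accuracy - gemini_accuracy
_, p_normal = stats.shapiro(diff)

if p_normal > 0.05:
    stat, p_value = stats.ttest_rel(gpt4o_accuracy, gemini_accuracy)
    test_name = "paired t-test"
else:
    stat, p_value = stats.wilcoxon(gpt4o_accuracy, gemini_accuracy)
    test_name = "Wilcoxon signed-rank test"

print(f"{test_name}: statistic = {stat:.3f}, p = {p_value:.3f}")
```

With 40 paired observations per domain, either test is feasible; the choice in the paper itself is not specified in the abstract.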
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,644 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,550 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,061 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,850 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations