This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Evaluating Artificial Intelligence in Patient Education: DeepSeek-V3 Versus ChatGPT-4o in Answering Common Questions on Laparoscopic Cholecystectomy
Citations: 15
Authors: 2
Year: 2025
Abstract
BACKGROUND: Artificial intelligence-based large language models (AI-based LLMs) have gained popularity over traditional search engines as a source of medical information. However, the accuracy and reliability of AI-generated medical information remain a topic of debate. Recently, a new AI-based LLM developed in East Asia, DeepSeek-V3, was introduced. This study evaluates the appropriateness, accuracy, and readability of the responses that ChatGPT-4o and DeepSeek-V3 give to questions patients frequently ask about laparoscopic cholecystectomy (LC), and the usability of those responses for patient education.
METHODS: The 20 questions patients most frequently ask about LC were presented to the DeepSeek-V3 and ChatGPT-4o chatbots. The search history was deleted before each question. Two board-certified general surgeons experienced in hepatobiliary surgery rated the comprehensiveness of the responses on a Likert scale, based on their clinical experience. The paired-samples t-test and the Wilcoxon signed-rank test were used. Inter-rater reliability was analyzed with Cohen's kappa.
RESULTS: DeepSeek-V3 provided statistically significantly more suitable responses than ChatGPT-4o (p = 0.033). On the Likert scale, DeepSeek-V3 received a 5-point rating for 19 of 20 questions (95%), whereas ChatGPT-4o received a 5-point rating for only 13 questions (65%). In the evaluation based on the reviewers' clinical experience, DeepSeek-V3 provided statistically significantly more appropriate responses (p = 0.008).
CONCLUSION: Released in January 2025, DeepSeek-V3 provides more suitable responses to patient inquiries regarding LC than ChatGPT-4o.
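The abstract names Cohen's kappa as the measure of agreement between the two raters but does not report per-question ratings. As a rough illustration only, the sketch below computes an unweighted Cohen's kappa for two lists of Likert scores. The function name `cohen_kappa` and the ratings in `rater_a` and `rater_b` are hypothetical and are not taken from the study; the authors may also have used a weighted variant, which this sketch does not cover.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same, non-empty set of items")

    n = len(rater_a)

    # Observed agreement: share of items on which both raters chose the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each category, the product of the two raters'
    # marginal proportions, summed over all categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category throughout; kappa is undefined.
        return float("nan")

    return (observed - expected) / (1.0 - expected)


# Hypothetical 5-point Likert ratings for ten answers (NOT data from the study).
rater_a = [5, 5, 4, 5, 3, 5, 4, 5, 5, 4]
rater_b = [5, 5, 4, 5, 4, 5, 4, 5, 5, 5]

kappa = cohen_kappa(rater_a, rater_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

With these made-up ratings the raters agree on 8 of 10 items, and kappa discounts the agreement that would be expected by chance from how often each rater uses each score.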
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,774 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,685 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,244 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,898 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations