This is an overview page with metadata for this scientific publication. The full article is available from the publisher.
Comparative performance of ChatGPT, Gemini, and DeepSeek on endodontic exam questions in Turkish and English
Citations: 2
Authors: 1
Year: 2025
Abstract
Objective: This study aimed to compare the performance of ChatGPT-4, Gemini 2.0, and DeepSeek-R1 in answering dentistry specialty exam (DUS) endodontics questions in Turkish and English.

Methods: A total of 130 multiple-choice endodontics questions from the DUS question pool were presented to ChatGPT-4 (OpenAI), Gemini 2.0 (Google), and DeepSeek-R1 under standardized conditions in both languages. Responses were categorized as “correct answer with correct explanation,” “correct answer with incorrect explanation,” and “incorrect.” Statistical analysis was performed in R using McNemar’s Chi-squared test and Fisher’s Exact Test (significance level: p < 0.05).

Results: All models performed better in English than in Turkish. In Turkish, DeepSeek-R1 and Gemini 2.0 significantly outperformed ChatGPT-4. All models answered simple-style questions more accurately than combination-style questions in both languages.

Conclusions: LLMs show potential in standardized dental exams but still struggle to fully grasp conceptual knowledge and may generate hallucinations. Continuous development is needed to improve their accuracy across languages and subject areas.
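The abstract names the two tests and the software but not how they were applied. Below is a minimal sketch of how such an analysis might look in base R, assuming hypothetical 2x2 count tables; the table names (paired, style) and all counts are illustrative assumptions, not the study's data.

# Minimal sketch, assuming made-up paired outcomes for two models on the
# same 130 questions; rows = ChatGPT-4, columns = DeepSeek-R1.
paired <- matrix(c(60, 25,
                   10, 35),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(ChatGPT = c("correct", "incorrect"),
                                 DeepSeek = c("correct", "incorrect")))

# McNemar's Chi-squared test compares two models' accuracy on the same
# (paired) questions, using only the discordant pairs.
mcnemar.test(paired)

# Fisher's Exact Test for an unpaired comparison, e.g., accuracy by
# question style (simple vs. combination); counts are again hypothetical.
style <- matrix(c(70, 20,
                  25, 15),
                nrow = 2, byrow = TRUE,
                dimnames = list(Style = c("simple", "combination"),
                                Answer = c("correct", "incorrect")))
fisher.test(style)

McNemar's test is the appropriate choice for the paired comparison here because each model answers the same question set, while Fisher's Exact Test suits small-count comparisons between independent groups.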
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,292 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,143 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,539 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,452 citations