This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Accuracy of Large Language Models in Answering Dental Examination Questions: A Systematic Review and Meta-Analysis
Citations: 0
Authors: 10
Year: 2026
Abstract
INTRODUCTION: Large language models (LLMs), including OpenAI's GPT family accessed via interfaces such as ChatGPT and Microsoft Copilot, as well as non-GPT systems such as Google Gemini, are increasingly applied in healthcare and dental education. However, the accuracy of these systems in specialized tasks such as answering dental examination questions remains unclear.
METHODS: This systematic review and meta-analysis evaluated LLM performance in answering dental questions. Databases searched were PubMed, Embase, Scopus, and Web of Science. Data on question type and number, LLM versions, and accuracy rates were extracted. Pooled accuracy was estimated using a random-effects model; heterogeneity and publication bias were assessed.
RESULTS: A total of 39 studies were included, with ChatGPT-4 being the most frequently evaluated model. The pooled accuracy for LLMs was 63.7% (95% CI: 60.3%-67.1%), with high heterogeneity (I² = 91.5%). Subgroup analysis revealed ChatGPT-4 and Copilot (a GPT-based interface) achieved the highest pooled accuracies (∼73% and ∼75%, respectively). Direct comparisons confirmed ChatGPT-4 significantly outperformed earlier versions and some competitor models. Sensitivity analyses supported the robustness of findings.
CONCLUSION: LLMs demonstrate moderate accuracy in answering dental examination questions and are currently insufficient for autonomous clinical decision-making. When their limitations are explicitly recognized, however, these systems may serve as valuable adjuncts in dental education and examination preparation. Methodological strategies such as structured prompting and retrieval-augmented approaches warrant further investigation but were not the primary focus of the present analysis.
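The abstract reports a pooled accuracy from a random-effects model together with an I² heterogeneity statistic, but does not say which estimator or transformation was used. As an illustration only, the sketch below pools per-study accuracies with the DerSimonian-Laird estimator on raw (untransformed) proportions. The estimator choice, the absence of a logit or arcsine transformation, the function name `pool_random_effects`, and the example counts are all assumptions made for this sketch; none of them come from the paper.

```python
import math


def pool_random_effects(correct, total):
    """DerSimonian-Laird random-effects pooling of raw proportions.

    Assumes every study accuracy lies strictly between 0 and 1,
    otherwise the binomial variance is zero and the weights are undefined.
    Returns (pooled accuracy, 95% confidence interval, I² in percent).
    """
    # Per-study accuracy and its binomial sampling variance.
    p = [c / n for c, n in zip(correct, total)]
    v = [pi * (1 - pi) / n for pi, n in zip(p, total)]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / vi for vi in v]
    fixed = sum(wi * pi for wi, pi in zip(w, p)) / sum(w)
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, p))
    df = len(p) - 1

    # Between-study variance tau², truncated at zero.
    c_dl = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c_dl)

    # Random-effects weights add tau² to each within-study variance.
    w_re = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * pi for wi, pi in zip(w_re, p)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    # I²: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2


# Hypothetical counts for three studies (not data from the review).
pooled, ci, i2 = pool_random_effects([60, 45, 80], [100, 100, 100])
print(f"pooled={pooled:.3f}, 95% CI=({ci[0]:.3f}, {ci[1]:.3f}), I2={i2:.1f}%")
```

Published meta-analyses of proportions often apply a variance-stabilizing transformation before pooling, so a reanalysis of the included studies would need to follow the method stated in the full article rather than this simplified form.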
Authors
Institutions
- Istinye University (TR)
- UCLA Health (US)
- UCLA Medical Center (US)
- Islamic Azad University Medical Branch of Tehran (IR)
- Universal Scientific Education and Research Network (IR)
- Tehran University of Medical Sciences (IR)
- King George's Medical University (IN)
- University of Tehran (IR)
- Islamic Azad University Dental Branch of Tehran (IR)
- Charles University (CZ)
- Ludwig-Maximilians-Universität München (DE)
- Chulalongkorn University (TH)
- LMU Klinikum (DE)