This is an overview page with metadata for this scholarly article. The full article is available from the publisher.
Comparative performance of AI chatbots in dental implantology: insights and limitations
Citations: 0 · Authors: 3 · Year: 2025
Abstract
OBJECTIVE: This study critically evaluated the performance, accuracy, and clinical relevance of three large language models (ChatGPT-4o, Claude 3.5, and Gemini 1.5 Pro) when answering expert-generated questions on zygomatic implantology. The goal was to determine the extent to which such tools may function as educational or clinical decision-support aids in maxillofacial surgery.
METHODS: Thirty-eight standardized questions were developed by four oral and maxillofacial surgeons with advanced expertise in zygomatic implantology. Each model's responses were independently assessed by five calibrated clinical raters using validated metrics (DISCERN, GQS, and a 5-point accuracy rubric) to judge reliability, quality, and factual correctness. Non-parametric statistics were used (Kruskal-Wallis with Bonferroni post hoc correction; Spearman correlation), and inter-rater reliability was quantified by ICC(2,1) = 0.86-0.91 (p < 0.001).
RESULTS: Gemini 1.5 Pro achieved slightly higher mean scores for response quality and accuracy, whereas Claude 3.5 and ChatGPT-4o performed comparably. However, absolute differences were modest (≤ 0.5 points on 5-point scales), indicating relative trends rather than decisive superiority. All models produced readable, clinically relevant content, though variability persisted in the depth and specificity of clinical guidance.
CONCLUSION: Current AI language models exhibit moderate but inconsistent competency when addressing complex implantology scenarios. While Gemini 1.5 Pro scored marginally higher, these differences are unlikely to be of major practical consequence. Continuous validation, transparent reporting of model versions, and expert supervision remain essential before integrating such systems into routine dental education or clinical decision-making.
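The omnibus test named in METHODS is the Kruskal-Wallis rank test, which compares the three models' rating distributions without assuming normality. As a minimal illustration of how the H statistic is computed, here is a pure-Python sketch with the standard tie correction; the sample ratings below are invented for demonstration and are not the study's data.

```python
from itertools import chain
from collections import Counter

def kruskal_h(*groups):
    """Kruskal-Wallis H statistic (tie-corrected) for k independent groups."""
    values = list(chain(*groups))
    n = len(values)
    # Assign average ranks, so tied values share the mean of their rank span.
    sorted_vals = sorted(values)
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and sorted_vals[j] == sorted_vals[i]:
            j += 1
        rank_of[sorted_vals[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    # H = 12 / (N(N+1)) * sum(R_g^2 / n_g) - 3(N+1)
    h = 12.0 / (n * (n + 1)) * sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N)
    ties = sum(t**3 - t for t in Counter(values).values())
    return h / (1 - ties / (n**3 - n)) if ties else h

# Hypothetical 5-point accuracy ratings for three chatbots (illustration only).
gemini  = [4, 5, 4, 4, 5]
claude  = [4, 4, 3, 4, 4]
chatgpt = [3, 4, 4, 3, 4]
H = kruskal_h(gemini, claude, chatgpt)
```

With three groups, H is referred to a chi-squared distribution with 2 degrees of freedom; Bonferroni-corrected pairwise comparisons, as used in the study, follow only if this omnibus test is significant.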
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,560 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,451 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,948 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,797 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations