OpenAlex · Updated hourly · Last updated: 18 March 2026, 16:40

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Evaluation of the Competency of Large Language Models GPT-4o and Claude 3.5 Sonnet in Endodontic Emergencies

2025 · 0 citations · European Annals of Dental Sciences · Open Access

Citations: 0 · Authors: 2 · Year: 2025

Abstract

Purpose: This study aimed to evaluate the accuracy and comprehensiveness of the responses generated by GPT-4o and Claude 3.5 Sonnet to the most frequently asked questions about endodontic emergencies. Materials and Methods: The most frequently asked questions on nine topics in endodontics (inferior alveolar nerve block, sodium hypochlorite accidents, aspiration of dental materials, separated instruments, perforation, transportation, Ca(OH)2 extrusion, root filling, and flare-up) were generated by GPT-3.5. Each question was posed to both GPT-4o and Claude 3.5 Sonnet. Two authors independently scored the responses. Accuracy and comprehensiveness were assessed for each question using Likert scales. The data were statistically analyzed using the Mann-Whitney U test and the Kruskal-Wallis test. The significance level was set at 0.05. Results: Responses generated by both GPT-4o and Claude 3.5 Sonnet to a total of 81 open-ended questions were evaluated. The two models yielded similar results in terms of accuracy and comprehensiveness (p > 0.05). For GPT-4o, the topics of root filling, perforation, and flare-up had the lowest accuracy scores, and root filling and separated instruments had the lowest comprehensiveness scores (p < 0.05). The accuracy of Claude 3.5 Sonnet's responses did not differ significantly between topics (p > 0.05); however, separated instruments had the lowest comprehensiveness scores (p < 0.05). Conclusion: The accuracy and comprehensiveness scores of GPT-4o and Claude 3.5 Sonnet are statistically similar. Despite the high levels of accuracy and comprehensiveness shown by GPT-4o and Claude 3.5 Sonnet, they cannot yet replace the operator in endodontic procedures.

Topics

Artificial Intelligence in Healthcare and Education · Dental Radiography and Imaging · Reliability and Agreement in Measurement