This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Clinical Relevance of Large Language Models in Endodontics: Diagnostic Appropriateness Based on 50 Simulated Case Scenarios
Citations: 0
Authors: 3
Year: 2025
Abstract
Large language models (LLMs) are increasingly used in healthcare, but their performance in endodontic decision-making remains unclear. This study aimed to compare six LLMs in terms of diagnostic appropriateness for endodontic treatment planning. Fifty clinical scenarios were developed and entered into six LLMs (ChatGPT-4o, ChatGPT-3.5, Claude 4, Copilot, DeepSeek-V3, Gemini 2.5). Two specialists scored responses as appropriate or inappropriate. Repeated measures ANOVA and chi-square tests were used for analysis. Claude showed the highest accuracy (76%), followed by DeepSeek and Gemini. ChatGPT-3.5 had the lowest (40%). Significant differences were found between models (p < 0.05). Performance was better on straightforward cases than on complex scenarios. LLMs vary widely in diagnostic accuracy for endodontic cases. While some models show promise, others may provide confidently incorrect recommendations. Caution and human oversight remain essential until domain-specific, fine-tuned models are developed.
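The abstract reports a chi-square comparison of model accuracies. A minimal sketch of such a test, assuming each model's accuracy applies to the same 50 scenarios (so Claude 4's 76% corresponds to 38 appropriate responses and ChatGPT-3.5's 40% to 20; the exact per-model counts are an assumption, not taken from the paper):

```python
# Hypothetical sketch of the abstract's chi-square comparison.
# Counts are inferred from the reported accuracies (76% and 40% of 50
# cases), not taken from the paper itself.
from scipy.stats import chi2_contingency

# Rows: models; columns: [appropriate, inappropriate] out of 50 cases.
observed = [
    [38, 12],  # Claude 4 (76% appropriate)
    [20, 30],  # ChatGPT-3.5 (40% appropriate)
]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.4f}")
```

For this 2x2 table the test yields p < 0.05, consistent with the significant between-model differences the abstract reports.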
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,250 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,109 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,482 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,434 citations