This is an overview page with metadata for this scholarly article. The full article is available from the publisher.
Performance of large language models on prosthodontics questions of the dentistry specialization examination: a comparative analysis (2014–2024)
2
Citations
2
Authors
2025
Year
Abstract
Aims: This study aimed to comparatively evaluate the performance of five contemporary large language models (LLMs) on prosthodontics questions of the dentistry specialization examination (DUS) between 2014 and 2024.

Methods: A total of 167 prosthodontics questions from the DUS were analyzed. The questions were administered to five different LLMs: ChatGPT-5 (OpenAI Inc., USA), Claude 4 (Anthropic, USA), Gemini 1.5 Pro (Google LLC, USA), DeepSeek-V2 (DeepSeek AI, China), and Perplexity Pro (Perplexity AI, USA). The models' responses were compared with the official answer keys provided by the Student Selection and Placement Center (OSYM), coded as correct or incorrect, and accuracy percentages were calculated. Statistical analyses included the Friedman test, correlation analysis, and frequency distributions. Subsection analyses were also performed to evaluate model performance across different content areas.

Results: DeepSeek-V2 achieved the highest overall accuracy rate (70.06%). Perplexity Pro (53.89%) and Gemini 1.5 Pro (51.50%) demonstrated moderate performance, ChatGPT-5 (49.10%) performed close to human levels, while Claude 4 had the lowest accuracy (32.34%). Subsection analyses revealed high accuracy in standardized knowledge areas such as implantology and temporomandibular joint (TMJ) disorders (66.7–100%), whereas notable decreases were observed in occlusion and morphology questions (9.1–53.9%). Correlation analyses indicated significant relationships between certain models.

Conclusion: The findings demonstrate heterogeneous performance of LLMs on DUS prosthodontics questions. While these models may serve as supplementary tools for exam preparation and dental education, their variable accuracy and potential for generating misinformation suggest they should not be used independently. Under expert supervision, LLMs may enhance dental education.
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,250 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,109 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,482 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,434 citations