This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Comparative diagnostic accuracy of ChatGPT large language models and expert clinicians in complex oral and maxillofacial diseases

2026 · 1 citation · Scientific Reports · Open Access

Abstract

Although the role of the Chat Generative Pre-trained Transformer (ChatGPT) in dentistry is being explored, its diagnostic reliability and performance in complex cases remain under-investigated. This study evaluates the ability of four ChatGPT versions (GPT-3.5, GPT-4o, GPT-4o Mini, and GPT-4) to interpret clinical, imaging, and histopathological data in oral and maxillofacial diseases and compares their diagnostic accuracy with that of experienced dentists, using 50 queries per model. Fifty complex oral and maxillofacial pathology cases were selected from a journal’s clinicopathological section, including clinical, radiographic, and histopathological data. Multiple-choice diagnostic questions were used to compare ChatGPT’s responses with the correct answers and with the diagnoses of two experienced dentists. Statistical analyses were conducted using SPSS (Version 25.0, IBM), with p < 0.05 considered significant. ChatGPT-4o achieved the highest accuracy among the models (70%), followed by ChatGPT-4 (66%), ChatGPT-3.5 (50%), and ChatGPT-4o Mini (46%), while the dentists outperformed the large language models (LLMs) with 72% accuracy. Among the LLMs, Cohen’s kappa showed the highest agreement between ChatGPT-3.5 and ChatGPT-4o Mini (κ = 0.760), with significant agreement also between ChatGPT-4o and ChatGPT-4 (κ = 0.520). The two dentists had the highest agreement overall (κ = 0.901). Overall, the LLMs showed lower diagnostic accuracy than the dentists, but ChatGPT-4o came closest to expert-level accuracy. This study highlights the potential of advanced ChatGPT versions to assist with oral lesion detection while also emphasizing the critical role of human expertise in diagnostic accuracy. Collaboration between LLMs and dental professionals can enhance diagnostic precision and improve patient outcomes.
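The accuracy and agreement figures in the abstract rest on two standard calculations: the share of correct answers (for example, 70% of 50 cases corresponds to 35 correct answers) and Cohen's kappa, which corrects the observed agreement between two raters for the agreement expected by chance. The sketch below illustrates both in Python. It is not the authors' analysis, which was run in SPSS; the function names and the ten-case toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def accuracy(predictions, truth):
    """Fraction of predictions that match the reference answers."""
    if len(predictions) != len(truth):
        raise ValueError("predictions and truth must have the same length")
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must label the same number of cases")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical toy data (NOT from the study): ten multiple-choice cases.
truth = ["A", "B", "C", "D", "A", "B", "C", "D", "A", "B"]
rater_1 = ["A", "B", "C", "D", "A", "B", "C", "A", "A", "C"]
rater_2 = ["A", "B", "C", "D", "A", "B", "D", "A", "A", "C"]

acc_1 = accuracy(rater_1, truth)
acc_2 = accuracy(rater_2, truth)
kappa_12 = cohens_kappa(rater_1, rater_2)
print(f"accuracy rater 1: {acc_1:.2f}")
print(f"accuracy rater 2: {acc_2:.2f}")
print(f"kappa between raters: {kappa_12:.3f}")
```

Note that kappa measures how often two raters give the same answer, not how often they are right, so two raters can agree closely even when neither is very accurate.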

Topics

Artificial Intelligence in Healthcare and Education · Radiomics and Machine Learning in Medical Imaging · Clinical Reasoning and Diagnostic Skills
