This is an overview page with metadata for this scientific work. The full article is available from the publisher.
ChatGPT-5 vs oral medicine experts for rank-based differential diagnosis of oral lesions: a prospective, biopsy-validated comparison
Citations: 6
Authors: 3
Year: 2025
Abstract
Accurate differential diagnosis of oral lesions is challenging. Large language models (LLMs) may support clinicians, but expert-validated evidence on ranked differential lists remains limited. This study aimed to compare ChatGPT-5 with ChatGPT-4o and an oral medicine expert for biopsy-confirmed oral lesions. In this prospective, paired accuracy study, 100 biopsy-confirmed cases with standardized vignettes and photographs were independently assessed to produce Top-5 ranked differentials. Accuracy at Top-1, Top-3, and Top-5 was benchmarked against histopathology; subgroup analyses considered lesion type and case difficulty. Agreement with the expert was evaluated using percent agreement, Cohen's κ, and AC1. Top-1 accuracies were 52% (ChatGPT-5), 59% (ChatGPT-4o), and 79% (expert; Cochran's Q, p < 0.001). At Top-3, accuracies were 72%, 77%, and 88%; at Top-5, 78%, 83%, and 91%. Inflammatory lesions showed significant Top-1 differences favoring the expert, whereas performance converged at broader ranks. Agreement with the expert improved at broader thresholds: ChatGPT-5 AC1 rose from 0.361 (Top-1) to 0.715 (Top-5), and ChatGPT-4o from 0.336 to 0.767, while κ remained in the fair range. ChatGPT-5 generated clinically useful ranked differentials approaching expert performance at Top-3/Top-5 but lagged at Top-1. Lesion type, particularly for inflammatory lesions, influenced accuracy, supporting supervised clinical use. Although large language models may assist in narrowing differential diagnoses, their role in oral medicine remains supportive rather than determinative. Human expertise remains indispensable, and integration into clinical workflows should be restricted to supervised settings until future iterations achieve parity with experts.
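The accuracy and agreement metrics named in the abstract can be illustrated with a short sketch. This is not code from the study; the case data below is invented for demonstration, and the functions show only the standard definitions of Top-k accuracy and Cohen's κ as typically applied to ranked differential lists.

```python
from collections import Counter

def top_k_accuracy(ranked_lists, truths, k):
    """Fraction of cases whose confirmed diagnosis appears in the
    first k entries of the ranked differential list."""
    hits = sum(truth in ranked[:k] for ranked, truth in zip(ranked_lists, truths))
    return hits / len(truths)

def cohens_kappa(a, b):
    """Cohen's kappa for two raters' labels (e.g. correct/incorrect
    per case): observed agreement corrected for chance agreement."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n               # observed
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[c] * cb[c] for c in set(a) | set(b)) / n**2  # by chance
    return (po - pe) / (1 - pe) if pe != 1 else 1.0

# Toy example (invented diagnoses): three cases, Top-5 lists vs. biopsy.
model = [["lichen planus", "leukoplakia", "candidiasis", "lupus", "GVHD"],
         ["fibroma", "mucocele", "lipoma", "papilloma", "SCC"],
         ["SCC", "leukoplakia", "ulcer", "lichen planus", "candidiasis"]]
truth = ["leukoplakia", "mucocele", "ulcer"]

print(top_k_accuracy(model, truth, 1))  # 0.0 — no case correct at Top-1
print(top_k_accuracy(model, truth, 3))  # 1.0 — all three correct at Top-3
```

The gap between the two printed values mirrors the study's pattern: a model can miss at Top-1 yet still place the right diagnosis within a broader ranked list. AC1 (Gwet's coefficient) follows the same observed-minus-chance structure as κ but uses a different chance-agreement estimate, which is why the abstract can report fair κ alongside substantially higher AC1.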
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,239 cit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,095 cit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,463 cit.
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 cit.
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,428 cit.