This is an overview page with metadata for this scientific work. The full article is available from the publisher.
ChatGPT-5 vs oral medicine experts for rank-based differential diagnosis of oral lesions: a prospective, biopsy-validated comparison
Citations: 6
Authors: 3
Year: 2025
Abstract
Accurate differential diagnosis of oral lesions is challenging. Large language models (LLMs) may support clinicians, but expert-validated evidence on ranked differential lists remains limited. This study aimed to compare ChatGPT-5 with ChatGPT-4o and an oral medicine expert for biopsy-confirmed oral lesions. In this prospective, paired accuracy study, 100 biopsy-confirmed cases with standardized vignettes and photographs were independently assessed to produce Top-5 ranked differentials. Accuracy at Top-1, Top-3, and Top-5 was benchmarked against histopathology; subgroup analyses considered lesion type and case difficulty. Agreement with the expert was evaluated using percent agreement, Cohen's κ, and AC1. Top-1 accuracies were 52% (ChatGPT-5), 59% (ChatGPT-4o), and 79% (expert; Cochran's Q, p < 0.001). At Top-3, accuracies were 72%, 77%, and 88%; at Top-5, 78%, 83%, and 91%. Inflammatory lesions showed significant Top-1 differences favoring the expert, whereas performance converged at broader ranks. Agreement with the expert improved at broader thresholds: ChatGPT-5 AC1 rose from 0.361 (Top-1) to 0.715 (Top-5), and ChatGPT-4o from 0.336 to 0.767, while κ remained in the fair range. ChatGPT-5 generated clinically useful ranked differentials approaching expert performance at Top-3/Top-5 but lagged at Top-1. Lesion type, particularly for inflammatory lesions, influenced accuracy, supporting supervised clinical use. Although large language models may assist in narrowing differential diagnoses, their role in oral medicine remains supportive rather than determinative. Human expertise remains indispensable, and integration into clinical workflows should be restricted to supervised settings until future iterations achieve parity with experts.
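The accuracy and agreement metrics named in the abstract can be illustrated with a short sketch. This is not code from the study; the case data below is invented for demonstration, and the functions show only the standard definitions of Top-k accuracy and Cohen's κ as typically applied to ranked differential lists.

```python
from collections import Counter

def top_k_accuracy(ranked_lists, truths, k):
    """Fraction of cases whose confirmed diagnosis appears in the
    first k entries of the ranked differential list."""
    hits = sum(truth in ranked[:k] for ranked, truth in zip(ranked_lists, truths))
    return hits / len(truths)

def cohens_kappa(a, b):
    """Cohen's kappa for two raters' labels (e.g. correct/incorrect
    per case): observed agreement corrected for chance agreement."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n               # observed
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[c] * cb[c] for c in set(a) | set(b)) / n**2  # by chance
    return (po - pe) / (1 - pe) if pe != 1 else 1.0

# Toy example (invented diagnoses): three cases, Top-5 lists vs. biopsy.
model = [["lichen planus", "leukoplakia", "candidiasis", "lupus", "GVHD"],
         ["fibroma", "mucocele", "lipoma", "papilloma", "SCC"],
         ["SCC", "leukoplakia", "ulcer", "lichen planus", "candidiasis"]]
truth = ["leukoplakia", "mucocele", "ulcer"]

print(top_k_accuracy(model, truth, 1))  # 0.0 — no case correct at Top-1
print(top_k_accuracy(model, truth, 3))  # 1.0 — all three correct at Top-3
```

The gap between the two printed values mirrors the study's pattern: a model can miss at Top-1 yet still place the right diagnosis within a broader ranked list. AC1 (Gwet's coefficient) follows the same observed-minus-chance structure as κ but uses a different chance-agreement estimate, which is why the abstract can report fair κ alongside substantially higher AC1.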
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,239 cit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,095 cit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,463 cit.
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 cit.
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,428 cit.