OpenAlex · Updated hourly · Last updated: 01.04.2026, 16:24

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Human versus artificial intelligence in oral pathology diagnosis: a comparative study of ChatGPT, Grok, and MANUS

2026 · 0 citations · 3 authors · Scientific Reports · Open Access


Abstract

Artificial intelligence (AI) integration in diagnostic medicine has advanced accuracy and efficiency, particularly in pathology. This study assessed the diagnostic performance of three large language models (LLMs), ChatGPT (GPT-4-turbo), Grok (xAI), and MANUS, in interpreting histopathology slides of oral lesions. A comparative diagnostic study was conducted using 100 high-resolution slides representing diverse oral pathologies. Images were sourced from a validated textbook and reviewed by two board-certified oral pathologists who provided consensus diagnoses. Each slide was analysed twice by the three AI models using standardized prompts. Diagnostic accuracy, intra-model consistency, inter-model concordance, and agreement with human experts were evaluated using descriptive statistics, Cohen's kappa, McNemar's test, and chi-square analysis. All AI models demonstrated high diagnostic accuracy. In the second round, Grok achieved the highest accuracy (97%), followed by MANUS (96%) and ChatGPT (94%). ChatGPT showed the highest intra-model consistency (κ = 0.918), while MANUS and Grok displayed substantial agreement (κ = 0.790 and 0.740). Expert pathologists achieved 98% accuracy. Comparisons between AI models and human diagnoses showed moderate to substantial agreement, with MANUS most aligned with experts. Most misclassifications occurred in histologically ambiguous cases, with no significant differences between AI models. Multimodal LLMs demonstrated strong diagnostic capabilities, consistency, and alignment with expert reasoning in oral histopathology interpretation. Grok was the most accurate, ChatGPT the most consistent, and MANUS the most expert-aligned. These findings support AI integration into digital pathology for diagnostic support, education, and quality assurance, with further validation in clinical datasets recommended.
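
To make the reported statistics concrete, the sketch below shows how Cohen's kappa, McNemar's test, and a chi-square comparison could be computed for this kind of two-round evaluation. It is a minimal illustration, not the authors' analysis code: the per-slide outcomes are randomly generated placeholders, and only the correct/incorrect counts for the chi-square example are taken from the accuracies reported in the abstract (Grok 97/100 vs. ChatGPT 94/100).

```python
# Minimal sketch (not from the paper): computing the agreement statistics named in the
# abstract for one model's two diagnostic rounds against the expert consensus.
import numpy as np
from sklearn.metrics import cohen_kappa_score
from scipy.stats import chi2_contingency
from statsmodels.stats.contingency_tables import mcnemar

# Hypothetical per-slide outcomes (1 = diagnosis matches expert consensus, 0 = mismatch)
rng = np.random.default_rng(0)
round1 = rng.integers(0, 2, size=100)
round2 = rng.integers(0, 2, size=100)

# Intra-model consistency across the two rounds (Cohen's kappa)
kappa = cohen_kappa_score(round1, round2)

# McNemar's test on the paired 2x2 table of correct/incorrect outcomes per round
table = np.array([
    [np.sum((round1 == 1) & (round2 == 1)), np.sum((round1 == 1) & (round2 == 0))],
    [np.sum((round1 == 0) & (round2 == 1)), np.sum((round1 == 0) & (round2 == 0))],
])
mcnemar_result = mcnemar(table, exact=True)

# Chi-square comparison of accuracy between two models (counts from the abstract)
model_counts = np.array([[97, 3],    # Grok:    correct vs. incorrect out of 100 slides
                         [94, 6]])   # ChatGPT: correct vs. incorrect out of 100 slides
chi2, p, dof, expected = chi2_contingency(model_counts)

print(f"kappa={kappa:.3f}, McNemar p={mcnemar_result.pvalue:.3f}, chi-square p={p:.3f}")
```

With only 100 slides and accuracies this close, the chi-square comparison is unlikely to reach significance, which is consistent with the abstract's finding of no significant differences between the AI models.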

Topics

Artificial Intelligence in Healthcare and Education · Radiomics and Machine Learning in Medical Imaging · COVID-19 diagnosis using AI