This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Comparison of ChatGPT and Claude in Managing Real-Life Difficult Nephrology Cases
Citations: 0
Authors: 8
Year: 2026
Abstract
Introduction: Artificial intelligence (AI)-based large language models (LLMs) are promising tools for clinical decision support, but their reliability in specialized fields such as nephrology remains uncertain. ChatGPT and Claude represent distinct AI architectures with potentially different clinical utilities. We aimed to compare the diagnostic accuracy, treatment recommendations, and overall clinical utility of these two AI models in managing real-life difficult nephrology cases.

Material and methods: Twenty-two real nephrology cases from a tertiary care university hospital were presented to both models, covering disorders such as glomerulonephritis, acute kidney injury, vasculitis, and transplant complications. Each model's output was assessed for diagnostic accuracy, risk evaluation, test recommendations, and treatment planning. Three independent nephrologists evaluated the responses using the Quality Assessment of Medical Information (QAMAI) and Global Quality Score (GQS) tools. Statistical comparisons were performed using the Wilcoxon signed-rank test, with p<0.05 considered significant.

Results: Claude achieved higher diagnostic accuracy than ChatGPT (4.59 ± 0.41 vs. 4.36 ± 0.48; p=0.048), whereas ChatGPT scored better in clarity (4.63 ± 0.30 vs. 4.32 ± 0.29; p=0.002). No significant differences were found in relevance, completeness, usefulness, or source citation. Overall QAMAI scores were comparable between the two models (ChatGPT: 23.72 ± 1.46; Claude: 23.39 ± 1.43; p=0.371). Inter-rater reliability ranged from moderate to good, with the highest agreement observed for ChatGPT's GQS.

Conclusions: Both ChatGPT and Claude demonstrate notable potential as decision-support tools in nephrology. Claude provided slightly higher diagnostic accuracy, while ChatGPT offered greater clarity. Despite these promising results, clinical judgment remains essential when interpreting LLM-generated suggestions.
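The paired comparison described in the methods (same cases scored under both models, Wilcoxon signed-rank test, two-sided p<0.05) can be sketched as below. This is a minimal stdlib-only illustration using the normal approximation without tie correction; the scores are invented for demonstration and are not the study's data, and the study itself does not specify which software was used.

```python
import math

def wilcoxon_signed_rank(x, y):
    """Paired Wilcoxon signed-rank test.

    Returns (W+, two-sided p-value) using the normal approximation to the
    null distribution of W+ (no tie correction). Zero differences are
    dropped, as in the standard procedure.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    # Rank the absolute differences, averaging ranks over exact ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    # W+ = sum of ranks belonging to positive differences.
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mu) / sigma
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return w_plus, p

# Illustrative per-case ratings for two models (hypothetical values).
model_a = [4.5, 4.7, 4.3, 4.8, 4.6, 4.4, 4.7, 4.5]
model_b = [4.3, 4.6, 4.4, 4.5, 4.4, 4.3, 4.5, 4.4]
stat, p_value = wilcoxon_signed_rank(model_a, model_b)
print(f"W+ = {stat}, p = {p_value:.3f}, significant = {p_value < 0.05}")
```

In practice a library routine such as `scipy.stats.wilcoxon` (which also offers an exact small-sample distribution) would typically be preferred over a hand-rolled version; the sketch only shows the mechanics of the paired, rank-based comparison.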
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,674 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,583 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,105 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,862 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations