OpenAlex · Updated hourly · Last updated: 17.03.2026, 20:30

This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Concordance Between the Multidisciplinary Team and ChatGPT-4o Decisions: A Blinded, Cross-Sectional Concordance Study in Systemic Autoimmune Rheumatic Diseases

2025 · 0 citations · Diagnostics · Open Access

Citations: 0 · Authors: 9 · Year: 2025

Abstract

Background/Objective: In recent years, artificial intelligence (AI) has gained increasing prominence in diagnostic decision-making in medicine. The aim of this study was to compare multidisciplinary team (MDT: rheumatology, pulmonology, thoracic radiology) decisions with single-session plans generated by ChatGPT-4o.

Methods: In this cross-sectional concordance study, adults (≥18 years) with confirmed systemic autoimmune rheumatic disease (SARD) and an MDT decision within the last 6 months were included. The study documented diagnostic, treatment, and monitoring decisions in cases of SARD by recording answers to six essential questions: (1) What is the most likely clinical diagnosis? (2) What is the most likely radiological diagnosis? (3) Is there a need for anti-inflammatory treatment? (4) Is there a need for antifibrotic treatment? (5) Is drug-free follow-up appropriate? and (6) Are additional investigations required? All evaluations were performed with ChatGPT-4o in a single-session format using a standardized single-prompt template, with the system blinded to MDT decisions. All data analyses were conducted in R (version 4.3.2). Agreement between AI-generated and MDT decisions was assessed using Cohen's kappa (κ), interpreted as follows: <0.20 = slight, 0.21-0.40 = fair, 0.41-0.60 = moderate, 0.61-0.80 = substantial, >0.80 = almost perfect agreement. These analyses were performed using the irr and psych packages in R. Statistical significance of the models was evaluated through p-values, while overall model fit was assessed using the Likelihood Ratio Test.

Results: A total of 47 patients were included, with a predominance of female patients (61.70%, n = 29). The mean age was 61.74 ± 10.40 years. The most frequently observed diagnosis was rheumatoid arthritis (RA), accounting for 31.91% of cases (n = 15), followed by anti-neutrophil cytoplasmic antibody (ANCA)-associated vasculitis, interstitial pneumonia with autoimmune features (IPAF), and sarcoidosis. Agreement was statistically significant across all decision types. For clinical diagnosis, agreement was moderate (κ = 0.52), suggesting that the AI system can reach partially consistent conclusions in diagnostic processes. Decisions on the need for immunosuppressive treatment and on follow-up without medication showed higher concordance, falling in the substantial range of the scale above (κ = 0.64 and κ = 0.67, respectively). For antifibrotic treatment decisions, agreement was moderate (κ = 0.49), and radiological diagnosis decisions also fell within the moderate range (κ = 0.55). The lowest agreement, though still moderate, was observed for decisions on whether further investigation was required (κ = 0.45).

Conclusions: In patients with SARDs with pulmonary involvement, particularly in complex cases, concordance was observed between MDT decisions and AI-generated recommendations regarding prioritization of clinical and radiologic diagnoses, treatment selection, suitability for drug-free follow-up, and the need for further diagnostic investigations.
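
The agreement statistic named in the abstract can be illustrated with a small sketch. The study ran its analysis in R with the irr and psych packages; the Python below is not the authors' code. It is a plain re-implementation of Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), for two raters, followed by a mapping onto the interpretation bands quoted above. The function names, the treatment of band boundaries as upper-inclusive, and the eight yes/no ratings are all assumptions made for illustration; the ratings are invented and are not study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters gave the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal proportions.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


def agreement_band(kappa):
    """Map kappa onto the bands quoted in the abstract.

    Assumption: each upper boundary is treated as inclusive.
    """
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical yes/no answers to one question for eight cases (not study data).
mdt = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
ai = ["yes", "yes", "yes", "no", "no", "no", "no", "yes"]

kappa = cohens_kappa(mdt, ai)
print(f"kappa = {kappa:.2f} ({agreement_band(kappa)})")
```

In this made-up example the raters agree on 6 of 8 cases (p_o = 0.75) and chance agreement is 0.5, so κ = 0.5. The same banding puts the reported κ = 0.52 in the moderate range and κ = 0.64 in the substantial range.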

Topics

Artificial Intelligence in Healthcare and Education · Clinical Reasoning and Diagnostic Skills · Radiomics and Machine Learning in Medical Imaging