OpenAlex · Updated hourly · Last updated: 12.03.2026, 08:26

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Alignment of ChatGPT with Expert Opinion in Nephrology Polls

2024 · 0 citations · Journal of the American Society of Nephrology

0 citations · 5 authors · Year: 2024

Abstract

Background: Healthcare professionals often face complex clinical scenarios that do not have straightforward solutions, necessitating professional collaboration. This is common in nephrology, where soliciting peer insight is crucial for informed decision making. ChatGPT, a sophisticated language model, has demonstrated its problem-solving utility in several fields. However, its alignment with prevailing medical opinions in the context of intricate clinical scenarios remains unexplored. This study seeks to evaluate how closely ChatGPT's responses align with the nephrology community's prevailing opinions by comparing responses to real-world clinical questions.

Methods: Nephrology polls were collected from the social media site X using the hashtag #AskRenal, resulting in 271 questions. These were presented to ChatGPT-4, which generated answers without prior knowledge of the poll outcomes. This was repeated one week later using the same questions presented in a randomized order to assess internal consistency. The responses given by ChatGPT-4 during the two rounds of inquiry were compared to the poll results (inter-rater) and between each other (intra-rater) using Cohen's kappa statistic (κ). The questions were also grouped into seven categories based on subject matter, and subgroup analysis was performed.

Results: In the first round of inquiry, 60.2% of ChatGPT's responses matched the poll results (κ=0.42); in the second round, 63.1% matched (κ=0.46). The two rounds had an internal agreement rate of 90.4% (κ=0.86). The included table presents subgroup data.

Conclusion: ChatGPT-4 demonstrates moderate capability in replicating prevailing professional opinion in nephrology polls, with varying performance between question categories and high internal consistency. While AI-based language models have the potential to assist with decision making in complex clinical scenarios, their reliability has yet to be fully proven and they should be integrated cautiously.
Agreement by question category

Category                                  | Round 1        | Round 2        | Intra-rater
CKD, ESRD, dialysis, & transplant         | 62% (κ=0.4)    | 64% (κ=0.5)    | 90% (κ=0.9)
Electrolyte & acid-base disorders         | 62% (κ=0.5)    | 54% (κ=0.4)    | 92% (κ=0.9)
Glomerular disease, AKI, & critical care  | 51% (κ=0.3)    | 58% (κ=0.4)    | 87% (κ=0.8)
Mineral, bone, & stone diseases           | 78% (κ=0.7)    | 89% (κ=0.8)    | 89% (κ=0.8)
Pharmacology                              | 65% (κ=0.5)    | 65% (κ=0.5)    | 100% (κ=1.0)
Tubular, interstitial, & cystic disorders | 50% (κ=0.1)    | 25% (κ=-0.1)   | 75% (κ=0.6)
Other                                     | 73% (κ=0.6)    | 80% (κ=0.7)    | 93% (κ=0.9)
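The inter- and intra-rater agreement figures above are quantified with Cohen's kappa, which corrects raw percent agreement for agreement expected by chance. As a minimal illustrative sketch (not the authors' actual analysis code), kappa for two raters labeling the same items can be computed as:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same n items."""
    assert len(rater_a) == len(rater_b) and len(rater_a) > 0
    n = len(rater_a)
    # Observed agreement: fraction of items where the raters match.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    p_e = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    # Kappa: observed agreement beyond chance, normalized.
    return (p_o - p_e) / (1 - p_e)

# Perfect agreement yields kappa = 1.0; agreement no better than
# chance yields kappa ≈ 0, even when raw agreement is 50%.
print(cohens_kappa([1, 1, 0, 0], [1, 1, 0, 0]))  # → 1.0
print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 1]))  # → 0.0
```

This is why the study reports both raw match rates (e.g. 60.2%) and kappa (e.g. κ=0.42): on binary polls a coin flip already matches the majority about half the time, so kappa gives a fairer picture of alignment.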
