OpenAlex · Updated hourly · Last updated: 26.03.2026, 00:45

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Performance of ChatGPT-4o in Real-Time Medical Consultation for Retroperitoneal Fibrosis Patients Under Doctor Supervision: A Cross-Sectional Study in a Chinese Clinical Setting (Preprint)

2024 · 0 citations · Open Access

Citations: 0
Authors: 14
Year: 2024

Abstract

BACKGROUND: Large language models (LLMs) such as GPT-4 show promise in medical consultations but face challenges in non-English and real-time contexts. The new GPT-4o, with improved text processing and faster responses, may better address rare diseases such as retroperitoneal fibrosis (RPF).

OBJECTIVE: The performance of GPT-4o in providing real-time medical consultations for patients with rare diseases, which are generally a challenge in clinical practice, remains underexplored. We evaluate the competency of GPT-4o in generating responses to questions about rare autoimmune RPF in terms of accuracy, completeness, readability, and quality, using a 7-point Likert scale.

METHODS: A total of 103 real-world queries from patients with RPF were collected from diverse sources. Responses were generated using the newly released version of GPT-4o (2024/5/17). All questions were also stratified and randomly divided into six groups. Each of six attending rheumatologists was assigned one set of questions to answer and then generated new responses with the assistance of GPT-4o. All responses were assessed blindly by three experts in RPF.

RESULTS: GPT-4o scored significantly higher than rheumatologists in accuracy (6.39 ± 0.50 vs. 4.99 ± 0.62), completeness (6.51 ± 0.44 vs. 4.55 ± 0.60), readability (6.45 ± 0.42 vs. 4.93 ± 0.59), and quality (6.42 ± 0.46 vs. 4.78 ± 0.55) (p < 0.001). The competency of rheumatologists + GPT-4o was better than that of rheumatologists alone (accuracy: 6.13 ± 0.63, completeness: 5.99 ± 0.81, readability: 6.05 ± 0.67, quality: 6.01 ± 0.71; p < 0.001), and physician revisions generally reduced the competency of GPT-4o. Subgroup analysis showed no significant difference in accuracy between GPT-4o and rheumatologists + GPT-4o in answering complex questions, but any type of revision lowered the competency of GPT-4o.

CONCLUSIONS: GPT-4o has the potential to provide real-time medical consultations for RPF in the Chinese clinical environment.

Topics

Artificial Intelligence in Healthcare and Education · Healthcare cost, quality, practices · COVID-19 diagnosis using AI