OpenAlex · Updated hourly · Last updated: 17.03.2026, 18:40

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Diagnostic Accuracy of a Large Language Model (ChatGPT-4) for Patients Admitted to a Community Hospital Medical Intensive Care Unit: A Retrospective Case Study

2025 · 1 citation · Journal of Intensive Care Medicine

Citations: 1 · Authors: 9 · Year: 2025

Abstract

Background

The future of artificial intelligence in medicine includes the use of machine learning and large language models as point-of-care tools to improve diagnostic accuracy at the time of admission to an acute care hospital. The large language model ChatGPT-4 has been shown to diagnose complex medical conditions with accuracy comparable to that of experienced clinicians; however, most published studies involved curated cases or examination-style questions rather than point-of-care use. To test the hypothesis that ChatGPT-4 can make an accurate medical diagnosis from real-world medical cases using a simple cut-and-paste strategy, we performed a retrospective case study of critically ill patients admitted to a community hospital medical intensive care unit.

Methods

A redacted history and physical (H&P) was cut and pasted into ChatGPT-4 with uniform instructions to provide a leading diagnosis and a differential diagnosis of 5 possibilities. All features that could be used to identify patients were removed to ensure privacy and HIPAA compliance. The ChatGPT-4 diagnoses were compared with critical care physician diagnoses, using a blinded longitudinal chart review as the ground-truth diagnosis.

Results

A total of 120 randomly selected cases were included in the study. Diagnostic accuracy was 88.3% for physicians and 85.0% for ChatGPT-4, with no significant difference by McNemar testing (p = 0.249). Agreement between physician and ChatGPT-4 diagnoses was moderate, 0.57 (95% CI: 0.35-0.79), by Cohen's kappa statistic.

Conclusion

These results suggest that ChatGPT-4 achieved diagnostic accuracy comparable to board-certified physicians for critically ill patients admitted to a community medical intensive care unit. The agreement was only moderate, however, suggesting that there may be complementary ways of combining the diagnostic acumen of physicians and ChatGPT-4 to improve overall accuracy. A prospective study would be necessary to determine whether ChatGPT-4 could improve patient outcomes as a point-of-care tool at the time of admission.
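The two statistics reported in the Results, the McNemar test on paired correct/incorrect diagnoses and Cohen's kappa for inter-rater agreement, can both be computed directly from a 2×2 agreement table. The sketch below is a minimal standard-library implementation; the cell counts in the usage note are hypothetical, since the abstract reports only marginal accuracies (106/120 and 102/120 correct), not the full physician-versus-ChatGPT-4 table.

```python
from math import comb

def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact (binomial) McNemar test.

    b = pairs where only rater 1 was correct,
    c = pairs where only rater 2 was correct.
    Concordant pairs do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Two-sided binomial tail probability under H0: b ~ Binomial(n, 0.5)
    p = 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(p, 1.0)

def cohens_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for a 2x2 agreement table:

    a = both correct, b = rater 1 only correct,
    c = rater 2 only correct, d = both incorrect.
    """
    n = a + b + c + d
    po = (a + d) / n                              # observed agreement
    pe = ((a + b) / n) * ((a + c) / n) \
       + ((c + d) / n) * ((b + d) / n)            # chance agreement
    return (po - pe) / (1 - pe)

# Hypothetical split of the 120 cases consistent with the reported
# marginals (physicians 106 correct, ChatGPT-4 102 correct); the
# paper's actual discordant counts are not given in the abstract.
p_value = mcnemar_exact(b=10, c=6)
kappa = cohens_kappa(a=96, b=10, c=6, d=8)
```

Only the discordant cells (b and c) drive the McNemar p-value, which is why two raters can have nearly identical accuracy yet only moderate kappa: agreement on *which* cases each gets right is a separate question from how many each gets right.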
