OpenAlex · Updated hourly · Last updated: 27.03.2026, 07:01

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Multimodal GPT-5 for Predicting Poor Functional Outcomes After Intracerebral Hemorrhage in the Emergency Department: Validation Study (Preprint)

2025 · 0 citations · 11 authors · Open Access

Abstract

BACKGROUND: In the emergency department (ED), rapid prognostic assessment of patients with intracerebral hemorrhage (ICH) is essential for guiding treatment, even when stroke specialists are unavailable. Recent advances in large language models have led to wider application of machine learning (ML) models in medical contexts.

OBJECTIVE: To evaluate the predictive performance of GPT-based models for poor functional outcomes after ICH using real-world multimodal data routinely available at ED presentation.

METHODS: Data from patients with ICH admitted to a tertiary hospital were analyzed. Using routinely collected clinical data and noncontrast computed tomography (CT) images at admission, GPT-4.1 and GPT-5 (accessed via Azure OpenAI Service) were applied to predict poor functional outcomes, defined as a modified Rankin Scale score of 3–6 at discharge. A conventional ML model was developed by combining deep learning-extracted features from Digital Imaging and Communications in Medicine (DICOM) CT data with clinical variables using L1-regularized logistic regression. GPT models were evaluated using the same clinical dataset and JPEG-format CT images. Model performance was assessed through discrimination (area under the receiver operating characteristic curve [AUROC]), calibration, reproducibility (intraclass correlation coefficient [ICC]), and clinical utility (decision curve analysis [DCA]).

RESULTS: The ML model achieved an AUROC of 0.85 (95% confidence interval, 0.79–0.90). Zero-shot GPT-4.1 and GPT-5 demonstrated strong discrimination (AUROC 0.83 and 0.86, respectively) with high reproducibility (ICC 0.91 and 0.95, respectively). Incorporating ML-derived information into model-informed prompts increased the AUROC to 0.85 and 0.87, respectively, with reproducibility remaining high (ICC 0.97 and 0.96, respectively). Calibration plots indicated that GPT models tended to underestimate probabilities; however, this bias improved after model-informed prompting. DCA showed a higher net benefit when ML-derived information was incorporated.

CONCLUSIONS: Zero-shot GPT models, particularly GPT-5, achieved predictive performance comparable to or exceeding that of conventional ML models using routinely available clinical data and CT images. Incorporating ML-derived outputs into GPT prompts further improved clinical utility, suggesting potential value for real-time decision support in emergency care.
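For readers unfamiliar with two of the metrics named in the abstract, the following is a minimal sketch of how AUROC and the net benefit used in decision curve analysis can be computed. It is not the authors' code: the function names `auroc` and `net_benefit` and the toy arrays `y_true` and `y_prob` are hypothetical and chosen only for illustration. It uses the standard definitions, with AUROC as the fraction of positive/negative pairs ranked correctly and net benefit as TP/n minus FP/n weighted by the odds of the threshold probability.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUROC as the share of (positive, negative) pairs in which the
    positive case receives the higher predicted probability; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one threshold probability pt:
    TP/n - FP/n * pt / (1 - pt). A decision curve plots this over many pt."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Toy data (hypothetical, not from the study): 1 = poor outcome (mRS 3-6).
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

print(auroc(y_true, y_prob))             # 0.9375
print(net_benefit(y_true, y_prob, 0.5))  # 0.25
```

In the toy data, 15 of the 16 positive/negative pairs are ordered correctly, giving 0.9375; at a threshold of 0.5 there are 3 true positives and 1 false positive among 8 cases, giving a net benefit of 0.25.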

Topics

Intracerebral and Subarachnoid Hemorrhage Research · Artificial Intelligence in Healthcare and Education · Acute Ischemic Stroke Management