This is an overview page with metadata for this scientific work. The full article is available from the publisher.

Abstract DP042: Explainable natural language processing (NLP) models to predict 90-day mortality of different stroke types from clinical note

2026 · 0 citations · Stroke

Abstract

Purpose: Free-text clinical notes contain rich prognostic information often lost in traditional models limited to structured variables (e.g., age, sex, NIHSS). Deep learning–based natural language processing (NLP) can leverage this information without manual variable extraction and may uncover risk factors beyond established predictors. We evaluated different NLP strategies for predicting 90-day mortality from ICU notes across multiple stroke types and examined model explainability using SHapley Additive exPlanations (SHAP) value quantification.
Methods: We used the Medical Information Mart for Intensive Care (MIMIC-IV) database (>40,000 ICU patients; Beth Israel Deaconess Medical Center, 2008–2019) and identified 7,511 patients with acute ischemic stroke, spontaneous intracerebral hemorrhage (ICH), non-traumatic subarachnoid hemorrhage (SAH), and traumatic SAH. We compared four NLP strategies using transformer-based models designed to process varying text lengths: (1) BioBERT with full free-text notes (512 tokens), (2) BioBERT with keyword-focused 512-token summaries, (3) Longformer (4,096 tokens), and (4) dual-stream BioBERT combining full notes and summaries. Separate models were trained for each stroke subtype. SHAP quantified the contribution of individual text tokens to patient-level risk predictions.
Results: Longformer and dual-stream BioBERT consistently outperformed other approaches. In 5-fold cross-validation, best-performing models achieved mean AUCs of 0.83±0.06 (ischemic stroke, n=1,819), 0.81±0.08 (ICH, n=1,657), 0.85±0.04 (non-traumatic SAH, n=877), and 0.82±0.02 (traumatic SAH, n=3,158), summarized in Table 1. SHAP identified high-impact tokens from free text such as “intubation,” “transfer,” specific comorbidities, and medications, many of which extend beyond known prognostic variables (Fig 1).
Conclusion: Transformer-based NLP models, particularly those handling longer text sequences or combining full and focused inputs, can accurately predict 90-day mortality from free-text ICU notes across stroke types. SHAP explainability highlights novel high-risk features, suggesting a potential role for automated, real-time risk stratification directly from the electronic health record to guide early intervention. Such free-text-based deep learning models can accelerate risk stratification model development by bypassing variable extraction and potentially identifying risk factors beyond known predictors.
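
The Results report performance as a mean AUC with a "±" spread over 5-fold cross-validation. As a minimal sketch of how such a summary can be computed, the Python below calculates a rank-based AUC per held-out fold and then the mean and spread across folds. It is not the authors' code: the function names `roc_auc` and `summarize_folds`, the `toy_folds` data, and the reading of "±" as the sample standard deviation across folds are all assumptions made for illustration, and the numbers are made up rather than taken from the study.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def summarize_folds(folds):
    """Return (mean AUC, sample SD) over (labels, scores) pairs,
    one pair per held-out fold. Assumes '±' means SD across folds."""
    aucs = [roc_auc(labels, scores) for labels, scores in folds]
    return mean(aucs), stdev(aucs)


# Hypothetical held-out predictions for three folds (not study data).
# Label 1 = died within 90 days, label 0 = survived.
toy_folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]),  # every positive outranks every negative
    ([1, 0, 0, 1], [0.7, 0.8, 0.1, 0.9]),  # one positive ranked below a negative
    ([0, 1, 0, 1], [0.3, 0.3, 0.1, 0.8]),  # one tie between a positive and a negative
]

mean_auc, sd_auc = summarize_folds(toy_folds)
print(f"AUC = {mean_auc:.2f} ± {sd_auc:.2f}")
```

In practice the per-fold scores would come from a model trained on the other folds, and a library routine such as scikit-learn's `roc_auc_score` would normally replace the quadratic pairwise loop used here for clarity.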

Topics

Intracerebral and Subarachnoid Hemorrhage Research · Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education
