This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Abstract 4366555: Artificial Intelligence to Extract Structured Details from Unstructured Medical Records in a Global Heart Failure Trial
Citations: 0
Authors: 14
Year: 2025
Abstract
Background: Global clinical trials collect extensive unstructured medical records that richly describe participants’ clinical presentation, but their narrative format precludes quantitative analysis. Converting these records into structured data could reveal new insights into events like heart failure hospitalization (HFH) but is prohibitively labor intensive. Varied documentation styles, translated text, and scanned or handwritten documents challenge data extraction in clinical trials.
Research Question: Can a carefully prompted large language model (LLM) accurately extract structured data on the clinical presentation of HFHs from unstructured adjudication dossiers in a global trial?
Methods: We extracted structured data for 51 variables (symptoms, signs, laboratory values, imaging findings, and treatments) from unstructured medical record dossiers in the DELIVER trial using a prompting workflow for the OpenAI o1-mini LLM. We validated LLM-extracted data against physician chart review in a random sample of 125 dossiers and calculated the accuracy, positive predictive value (PPV), and negative predictive value (NPV). A second physician reviewed 25 dossiers to assess inter-reviewer reproducibility. The validated model was then applied to extract presenting features in all 1,111 HFHs in the trial.
Results: LLM-extracted data achieved high overall accuracy of 0.96 (95% CI: 0.96–0.97), PPV of 0.94 (95% CI: 0.93–0.95), and NPV of 0.97 (95% CI: 0.97–0.98) on physician validation. Accuracy exceeded 0.90 for all variables except numeric BNP and troponin I, where errors mainly reflected confusion between biomarker subtypes (BNP vs NT-proBNP, or troponin I vs troponin T). The reproducibility of human review was 98%. In the full trial, dyspnea (92%), peripheral edema (68%), pulmonary crackles (51%), congestion on chest imaging (59%), and natriuretic peptide elevation (65%) were reported frequently. Mean peri-admission ejection fraction was 46% ± 12% (n=481 with available data), vs a baseline mean of 53% ± 9%. Intravenous diuretics were used in 85% of hospitalizations, and oral diuretic doses were increased in 45%.
Conclusion: A prompt-engineered LLM accurately extracted structured data, including signs and symptoms, laboratory data, and imaging data, from adjudication dossiers at scale in a global clinical trial to generate a structured, usable database. LLM-based data extraction could be extended to unlock quantitative insights from a wide range of narrative trial records.
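The validation step described in the Methods (accuracy, PPV, and NPV of LLM-extracted values against physician chart review) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `validation_metrics` and `wilson_interval` are hypothetical, the inputs are assumed to be binary present/absent labels, and the Wilson score interval is an assumed choice, since the abstract does not state how its 95% confidence intervals were computed.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class ValidationMetrics:
    accuracy: float
    ppv: float
    npv: float


def validation_metrics(llm_labels, physician_labels):
    """Compare binary LLM extractions with physician chart review (the reference)."""
    if len(llm_labels) != len(physician_labels):
        raise ValueError("label lists must have the same length")
    tp = fp = tn = fn = 0
    for predicted, reference in zip(llm_labels, physician_labels):
        if predicted and reference:
            tp += 1  # LLM and physician both report the finding
        elif predicted and not reference:
            fp += 1  # LLM reports a finding the physician did not
        elif not predicted and reference:
            fn += 1  # LLM misses a finding the physician recorded
        else:
            tn += 1  # both report the finding as absent
    total = tp + fp + tn + fn
    nan = float("nan")
    return ValidationMetrics(
        accuracy=(tp + tn) / total if total else nan,
        ppv=tp / (tp + fp) if (tp + fp) else nan,
        npv=tn / (tn + fn) if (tn + fn) else nan,
    )


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed method; z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Toy labels for illustration only; these are not trial data.
    llm = [1, 1, 1, 0, 0, 0, 0, 1]
    physician = [1, 1, 0, 0, 0, 0, 1, 1]
    print(validation_metrics(llm, physician))
    print(wilson_interval(successes=6, n=8))
```

In practice, one set of metrics would be computed per extracted variable as well as pooled across all variables, which is how per-variable shortfalls such as the BNP and troponin I subtype confusions would surface.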