This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Development and External Validation of Multimodal Machine Learning Models to Predict High Inpatient Opioid Exposure
Citations: 0
Authors: 5
Year: 2026
Abstract
High inpatient opioid exposure is associated with increased risk of persistent opioid use, yet early identification of high-risk patients during hospitalization remains limited. We developed and evaluated machine learning models to predict extreme opioid exposure using electronic health record data from MIMIC-IV. This retrospective cohort-based prediction modeling study included 223,452 unique first hospital admissions. The outcome was extreme opioid exposure, defined as the top decile of morphine milligram equivalents (MME) per day among opioid-exposed admissions (corresponding to ≥225 MME/day in the development cohort), representing 2.65% of all admissions. Structured early-admission features included demographics, admission characteristics, laboratory utilization and abnormality summaries, and 24-hour procedural indicators. Discharge-note data were incorporated using ClinicalBERT embeddings and bigram features. Models were trained using an 80/10/10 split, with temporal validation performed on the most recent 10% of admissions. External validation was conducted using the MIMIC-III and eICU Collaborative Research Database cohorts. Performance was assessed using ROC-AUC and PR-AUC with 95% confidence intervals. Among structured-only models, XGBoost achieved the best internal test performance (ROC-AUC 0.932 [0.924-0.940]; PR-AUC 0.223 [0.193-0.262]). A combined structured and notes model improved precision-recall performance (ROC-AUC 0.932 [0.920-0.943]; PR-AUC 0.276 [0.229-0.331]). Temporal validation showed similar discrimination (ROC-AUC 0.929; PR-AUC 0.223). In external validation, performance decreased substantially. In MIMIC-III, the model achieved ROC-AUC 0.669 [0.659-0.680] and PR-AUC 0.018 [0.017-0.019], while in eICU performance was further attenuated (ROC-AUC 0.567 [0.556-0.576]; PR-AUC 0.018 [0.017-0.019]). Predicted probabilities were poorly calibrated in both external datasets, with limited correspondence between predicted and observed risk.
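The outcome definition above (top decile of MME per day among opioid-exposed admissions) can be sketched in a few lines. This is a minimal illustration under assumed inputs; the function names and the example exposure values are hypothetical and do not reproduce the study's cohort construction.

```python
# Hypothetical sketch of the outcome definition: label an admission as
# "extreme exposure" when its morphine milligram equivalents (MME) per day
# fall in the top decile of opioid-exposed admissions. All names and values
# here are illustrative, not taken from the study's data.

def mme_per_day(total_mme: float, los_days: float) -> float:
    """Average daily opioid exposure over the length of stay."""
    return total_mme / max(los_days, 1.0)

def extreme_exposure_labels(daily_mmes: list[float]) -> tuple[float, list[int]]:
    """Return the top-decile threshold and binary labels (1 = extreme)."""
    ranked = sorted(daily_mmes)
    # Nearest-rank 90th percentile: admissions at or above it are positive.
    idx = min(int(0.9 * len(ranked)), len(ranked) - 1)
    threshold = ranked[idx]
    labels = [1 if m >= threshold else 0 for m in daily_mmes]
    return threshold, labels

# Toy cohort of ten opioid-exposed admissions (MME/day each):
exposures = [10, 30, 50, 80, 120, 150, 180, 200, 240, 300]
thr, y = extreme_exposure_labels(exposures)  # top decile -> one positive
```

With ten admissions the top decile contains a single admission, so exactly one label is positive; on the development cohort the analogous cutoff was ≥225 MME/day.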
These findings demonstrate that while EHR-based machine learning models can achieve strong internal discrimination, their performance and calibration may degrade substantially across independent healthcare systems, underscoring the need for dataset-specific validation and recalibration prior to clinical application.

Author Summary

Opioid medications are commonly used in hospitals to treat pain, but some patients receive very high doses, which may increase their risk of long-term opioid use and dependence. Identifying these high-risk patients early during hospitalization could help doctors make safer prescribing decisions and improve pain management. In this study, we analyzed electronic health record data from over 220,000 hospital admissions to develop machine learning models that estimate which patients are likely to receive high levels of opioids. We focused on information available within the first 24 hours of admission, including patient characteristics, laboratory testing patterns, and procedures. We also examined whether information from clinical notes could improve predictions. We found that the models performed well within the original dataset, and that combining structured data with clinical notes improved performance. The patterns identified by the models, such as links to surgical procedures and more intensive care, were in line with expectations. However, when the models were tested in different hospital datasets, performance declined, and predicted risks no longer matched what actually happened to patients. These findings highlight an important challenge in applying machine learning in healthcare: models that perform well in one setting may not work reliably in others. While routinely collected hospital data may help identify patients at risk early, predictive models must be carefully tested and adapted before they can be safely used in new clinical environments.
Similar Works
Global burden of disease attributable to mental and substance use disorders: findings from the Global Burden of Disease Study 2010
2013 · 6,027 citations
CDC Guideline for Prescribing Opioids for Chronic Pain—United States, 2016
2016 · 5,145 citations
Respondent-Driven Sampling: A New Approach to the Study of Hidden Populations
1997 · 4,272 citations
The fifth edition of the addiction severity index
1992 · 4,249 citations
Detecting alcoholism. The CAGE questionnaire
1984 · 3,996 citations