
S3007 Development and Validation of an Electronic Health Record Machine Learning Model for Cumulative Upper Gastrointestinal Cancer Risk Prediction

2025 · 0 citations · The American Journal of Gastroenterology

Abstract

Introduction: Upper gastrointestinal (UGI) cancers are predominantly diagnosed at advanced stages and disproportionately impact minority and immigrant populations. Existing cancer-specific prevention strategies underestimate the potential benefits of unified endoscopic screening for all UGI cancers. We evaluated the viability of a novel UGI cancer risk assessment model using routinely collected electronic health record variables.
Methods: We conducted a large-scale retrospective cohort study of adults aged 40 to 85 years with at least 1 outpatient or inpatient encounter in the Mount Sinai Health System (MSHS) between 2011 and 2022. Patients were excluded if they had a cancer registry-confirmed UGI cancer diagnosis within 1 year of their initial encounter or if the diagnosis lacked histological confirmation. Routinely collected demographic, social, clinical, and laboratory data were extracted from structured electronic health record fields. Time-dependent variables were extracted from the first encounter. Cox proportional hazards (CoxPH), logistic regression, and XGBoost models were developed for 3-, 5-, and 10-year risk horizons using an 80/20 train-validation split stratified by cancer diagnosis. In each model, features were selected based on univariate analysis, with significance defined as a P-value < 0.05. The best model was selected based on AUROC on the validation set. Performance metrics were calculated at the optimal risk percentile defined by the Youden index.
Results: Our final cohort consisted of 1,745,288 patients, of whom 313 developed UGI cancer after a median follow-up time of 10 years. Cohort demographics are shown in Table 1. The best-performing model was a CoxPH model, which achieved an AUROC of 0.929 for a 5-year horizon. The strongest predictors were PPI use, peptic ulcer disease, head and neck cancer, and male sex. At the optimal 92nd risk percentile, the model achieved a sensitivity of 81.8%, a specificity of 92.4%, and a number needed to screen of 754 patients.
Conclusion: We internally validated a UGI cancer risk model that uses 15 readily available predictors from the initial patient encounter. The model demonstrated acceptable discrimination performance and has potential to inform unified screening strategies. Further work is needed to externally validate the model.
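
The abstract reports its performance metrics at the operating point chosen by the Youden index (J = sensitivity + specificity - 1). The following is a minimal Python sketch of that selection step, not the authors' code: the function names and the toy scores and labels are hypothetical, and it thresholds raw risk scores rather than risk percentiles (a monotonic transform, so the chosen cut point corresponds to the same patients).

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Treat score >= threshold as a positive call and return (sensitivity, specificity).

    Assumes labels contain at least one positive (1) and one negative (0).
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_optimal_threshold(
    scores: Sequence[float], labels: Sequence[int]
) -> Tuple[float, float, float, float]:
    """Scan every observed score as a candidate threshold and keep the one
    that maximizes Youden's J. Returns (threshold, J, sensitivity, specificity).
    Ties are resolved in favor of the lowest threshold."""
    best = None
    for t in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, t)
        j = sens + spec - 1.0
        if best is None or j > best[1]:
            best = (t, j, sens, spec)
    return best


# Hypothetical toy data: predicted risks and observed outcomes (1 = cancer).
scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.80, 0.90, 0.95]
labels = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]

threshold, j, sens, spec = youden_optimal_threshold(scores, labels)
print(f"threshold={threshold}, J={j:.2f}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

On a cohort of the size described here, an exhaustive scan over every distinct score would be slow; a vectorized ROC routine would be the usual choice, but the selection rule is the same.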

Topics: Machine Learning in Healthcare; Cardiovascular Health and Risk Factors; Artificial Intelligence in Healthcare and Education
