This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
BEDS-Bench: Behavior of EHR-models under Distributional Shift--A Benchmark
5
Citations
6
Authors
2021
Year
Abstract
Machine learning has recently demonstrated impressive progress in predictive accuracy across a wide array of tasks. Most ML approaches focus on generalization performance on unseen data that are similar to the training data (In-Distribution, or IND). However, real world applications and deployments of ML rarely enjoy the comfort of encountering examples that are always IND. In such situations, most ML models commonly display erratic behavior on Out-of-Distribution (OOD) examples, such as assigning high confidence to wrong predictions, or vice-versa. Implications of such unusual model behavior are further exacerbated in the healthcare setting, where patient health can potentially be put at risk. It is crucial to study the behavior and robustness properties of models under distributional shift, understand common failure modes, and take mitigation steps before the model is deployed. Having a benchmark that shines light upon these aspects of a model is a first and necessary step in addressing the issue. Recent work and interest in increasing model robustness in OOD settings have focused more on image modality, while the Electronic Health Record (EHR) modality is still largely under-explored. We aim to bridge this gap by releasing BEDS-Bench, a benchmark for quantifying the behavior of ML models over EHR data under OOD settings. We use two open access, de-identified EHR datasets to construct several OOD data settings to run tests on, and measure relevant metrics that characterize crucial aspects of a model's OOD behavior. We evaluate several learning algorithms under BEDS-Bench and find that all of them show poor generalization performance under distributional shift in general. Our results highlight the need and the potential to improve robustness of EHR models under distributional shift, and BEDS-Bench provides one way to measure progress towards that goal.
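The abstract's point about models "assigning high confidence to wrong predictions" on OOD examples can be made concrete with a calibration measure. The following is a minimal sketch only: expected calibration error (ECE) is a common general-purpose metric chosen here for illustration, and the abstract does not say which metrics BEDS-Bench actually reports. The function name `expected_calibration_error`, the bin count, and all toy numbers are assumptions, not taken from the paper.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between mean confidence and accuracy per bin.

    Hypothetical helper for illustration; not part of BEDS-Bench.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one prediction")

    # Group (confidence, correctness) pairs into equal-width bins on [0, 1].
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, 1.0 if ok else 0.0))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(a for _, a in members) / len(members)
        # Each bin contributes its confidence/accuracy gap, weighted by its share of samples.
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Made-up toy numbers, for illustration only (not results from the paper).
ind_conf = [0.9, 0.8, 0.95, 0.7]
ind_correct = [True, True, True, False]
ood_conf = [0.9, 0.85, 0.95, 0.8]
ood_correct = [True, False, False, False]

ece_ind = expected_calibration_error(ind_conf, ind_correct)
ece_ood = expected_calibration_error(ood_conf, ood_correct)
print(f"ECE in-distribution: {ece_ind:.4f}")
print(f"ECE out-of-distribution: {ece_ood:.4f}")
```

In this toy setup the model stays equally confident on the shifted data while being wrong more often, so the OOD calibration error comes out larger than the IND one. A real evaluation would compute such measures on held-out IND and OOD splits of the EHR datasets.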
Similar works
"Why Should I Trust You?"
2016 · 14,314 citations
A Comprehensive Survey on Graph Neural Networks
2020 · 8,684 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,211 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,614 citations
Artificial intelligence in healthcare: past, present and future
2017 · 4,411 citations