This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Low responsiveness of machine learning models to critical or deteriorating health conditions
Citations: 7
Authors: 7
Year: 2025
Abstract
BACKGROUND: Machine learning (ML) based mortality prediction models can be immensely useful in intensive care units. Such a model should generate warnings to alert physicians when a patient's condition rapidly deteriorates, or their vitals are in highly abnormal ranges. Before clinical deployment, it is important to comprehensively assess a model's ability to recognize critical patient conditions.

METHODS: We develop multiple medical ML testing approaches, including a gradient ascent method and neural activation map. We systematically assess these machine learning models' ability to respond to serious medical conditions using additional test cases, some of which are time series. Guided by medical doctors, our evaluation involves multiple machine learning models, resampling techniques, and four datasets for two clinical prediction tasks.

RESULTS: We identify serious deficiencies in the models' responsiveness, with the models being unable to recognize severely impaired medical conditions or rapidly deteriorating health. For in-hospital mortality prediction, the models tested using our synthesized cases fail to recognize 66% of the injuries. In some instances, the models fail to generate adequate mortality risk scores for all test cases. Our study identifies similar kinds of deficiencies in the responsiveness of 5-year breast and lung cancer prediction models.

CONCLUSIONS: Using generated test cases, we find that statistical machine learning models trained solely from patient data are grossly insufficient and have many dangerous blind spots. Most of the ML models tested fail to respond adequately to critically ill patients. How to incorporate medical knowledge into clinical machine learning models is an important future research direction.
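The gradient ascent testing idea mentioned in METHODS can be illustrated with a minimal sketch. This is not the authors' implementation: the toy logistic risk model, the two-feature input, and the step size are all assumed for illustration. The idea is to push a patient feature vector in the direction that should increase predicted risk and then check whether the model's output actually responds.

```python
# Illustrative sketch (not the paper's code): gradient-ascent test case
# generation against a differentiable mortality risk model.
import numpy as np

def risk(x, w, b):
    """Predicted mortality risk of a toy logistic model."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def grad_risk(x, w, b):
    """Gradient of the predicted risk with respect to the input features."""
    p = risk(x, w, b)
    return p * (1.0 - p) * w

def generate_test_case(x0, w, b, steps=100, lr=0.5):
    """Push a patient vector toward higher predicted risk via gradient ascent.

    The resulting synthetic case can then be reviewed by clinicians: if its
    features are clearly critical but the model's risk score stays low, that
    is a responsiveness failure of the kind the paper reports.
    """
    x = x0.copy()
    for _ in range(steps):
        x = x + lr * grad_risk(x, w, b)
    return x

# Toy usage: two standardized vitals, weights chosen arbitrarily.
w = np.array([0.8, -0.5])
b = -1.0
x0 = np.zeros(2)
x_crit = generate_test_case(x0, w, b)
assert risk(x_crit, w, b) > risk(x0, w, b)  # ascent raised the predicted risk
```

In the paper's setting, the same loop would be run against the trained clinical model (with medical plausibility constraints on the features), and the generated cases become additional tests of whether the model recognizes deteriorating conditions.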
Related works
"Why Should I Trust You?"
2016 · 14,605 citations
Coding Algorithms for Defining Comorbidities in ICD-9-CM and ICD-10 Administrative Data
2005 · 10,522 citations
A Comprehensive Survey on Graph Neural Networks
2020 · 8,874 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,443 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,941 citations