Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
A Biologically Plausible Benchmark for Contextual Bandit Algorithms in Precision Oncology Using in vitro Data
2
Zitationen
5
Autoren
2019
Jahr
Abstract
Precision oncology, the genetic sequencing of tumors to identify druggable targets, has emerged as the standard of care in the treatment of many cancers. Nonetheless, due to the pace of therapy development and variability in patient information, designing effective protocols for individual treatment assignment in a sample-efficient way remains a major challenge. One promising approach to this problem is to frame precision oncology treatment as a contextual bandit problem and to apply sequential decision-making algorithms designed to minimize regret in this setting. However, a clear prerequisite for considering this methodology in high-stakes clinical decisions is careful benchmarking to understand realistic costs and benefits. Here, we propose a benchmark dataset to evaluate contextual bandit algorithms based on real in vitro drug response of approximately 900 cancer cell lines. Specifically, we curated a dataset of complete treatment responses for a subset of 7 treatments from prior in vitro studies. This allows us to compute the regret of proposed decision policies using biologically plausible counterfactuals. We ran a suite of Bayesian bandit algorithms on our benchmark, and found that the methods accumulate less regret over a sequence of treatment assignment tasks than a rule-based baseline derived from current clinical practice. This effect was more pronounced when genomic information was included as context. We expect this work to be a starting point for evaluation of both the unique structural requirements and ethical implications for real-world testing of bandit based clinical decision support.
Ähnliche Arbeiten
Adam: A Method for Stochastic Optimization
2014 · 84.467 Zit.
A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting
1997 · 19.987 Zit.
No free lunch theorems for optimization
1997 · 13.692 Zit.
Diagnosing Non-Intermittent Anomalies in Reinforcement Learning Policy Executions (Short Paper)
2017 · 11.254 Zit.
Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions
2005 · 10.149 Zit.