This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Bandit Processes and Dynamic Allocation Indices
Citations: 1,512
Authors: 1
Year: 1979
Abstract
The paper aims to give a unified account of the central concepts in recent work on bandit processes and dynamic allocation indices; to show how these reduce some previously intractable problems to the problem of calculating such indices; and to describe how these calculations may be carried out. Applications to stochastic scheduling, sequential clinical trials and a class of search problems are discussed.
Similar works
Adam: A Method for Stochastic Optimization
2014 · 84,458 citations
A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting
1997 · 19,901 citations
No free lunch theorems for optimization
1997 · 13,635 citations
Diagnosing Non-Intermittent Anomalies in Reinforcement Learning Policy Executions (Short Paper)
2017 · 11,235 citations
Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions
2005 · 10,129 citations