This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Introduction to Supervised Machine Learning
Citations: 26
Authors: 3
Year: 2021
Abstract
The common conception and criticism of machine learning (ML) in medicine is that it centers around a “black box,” an inscrutable series of mathematical calculations that take in data and spit out predictions, lacking the pathobiological explanatory rigor to which medical researchers are accustomed. Although this is oversimplified, it is not altogether untrue. It is also not really a problem. The true place for ML in medicine is not in explanation, but in prediction. The broadest definitions of ML describe processes that take in historical data, learn salient relationships, and use that knowledge to make predictions on new data. To that end, ML algorithms have been present in medicine for decades, hiding in plain sight in the form of, for example, logistic regression and Cox proportional hazards models. Although these algorithms fall within the category of supervised ML (there is a single target outcome, such as death, to predict), they are somewhat special in that they are readily interpretable and assign a “weight” to the various parameters in the model. Indeed, researchers often overinterpret these weights, suggesting they imply a causal mechanism as opposed to a mere association (1). Modern ML algorithms reorient the central goal to flexibly predicting outcomes for new data as closely as possible. This allows us to ease many of the strong assumptions behind classic models, permitting the connection between covariates and an outcome to be mediated by any black-box algorithm, saving considerations of interpretability and plausibility for post-hoc discussions (2). In this communication, we hope to orient readers to the techniques, mechanisms, and purpose of supervised ML, which has a goal of predicting outcomes. For ML strategies to work, the entire data lifecycle must be carefully considered. The pipeline naturally begins with data selection, which includes cohort selection, but further aims to clarify which aspects …
Similar works
"Why Should I Trust You?"
2016 · 14,732 citations
Coding Algorithms for Defining Comorbidities in ICD-9-CM and ICD-10 Administrative Data
2005 · 10,547 citations
A Comprehensive Survey on Graph Neural Networks
2020 · 8,949 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,550 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,061 citations