This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Comparing conventional statistical models and machine learning in a small cohort of South African cardiac patients
25 citations · 6 authors · 2022
Abstract
Machine learning is used to process large volumes of data with complex non-linear relationships between predictive variables and predictions. Research into the usefulness of machine learning with small datasets remains limited. This study aimed to compare conventional statistical methods and machine learning for predicting angiogram outcomes in a small cohort of South African cardiac patients.
This is a retrospective study of patients with cardiac risk factors at Inkosi Albert Luthuli Central Hospital, Durban, South Africa, from 2002 to 2008. Models were designed using predictive risk factors to forecast a binary angiogram outcome (normal or abnormal) by applying conventional statistical models (binary logistic and log binomial) and stacking ensemble machine learning.
The outcome prevalence of abnormal angiograms was 99/173 (57%). Predictive data were used to model this outcome. The binary logistic regression model, which estimates odds ratios, was unsuitable. The log binomial model, which estimates relative risk, did not converge after various stepwise modelling attempts. Thereafter, machine learning models were used. These included logistic regression, k-nearest neighbour, decision tree, support vector machine, and naïve Bayes. The ensemble model amalgamated all algorithms and showed accuracy > 70% and excellent performance at different thresholds, with an area under the curve (AUC) > 80%.
The logistic regression model was unsuitable because the outcome prevalence was > 10%, so an odds ratio would have been unreliable and would have overestimated the true effect. A log binomial model with relative risk estimates did not converge, possibly owing to the multiple predictive variables. Overall, conventional statistical models were unsuccessful in this instance. The machine learning models were limited by the small dataset. However, combined modelling with the stacking ensemble method produced good results in the small, homogeneous database by exploiting the strengths of each contributing algorithm.
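As a rough illustration of the stacking approach described above, the following sketch combines the five named algorithms with scikit-learn's `StackingClassifier`. It is not the authors' pipeline: the data are synthetic (generated only to match the cohort size of 173 and an outcome prevalence of roughly 57%), and the feature count, hyperparameters, logistic regression meta-learner, and cross-validation scheme are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cohort (assumption): 173 rows, roughly 57% positives.
X, y = make_classification(
    n_samples=173, n_features=8, n_informative=4,
    weights=[0.43, 0.57], random_state=0,
)

# The five base learners named in the abstract; hyperparameters are illustrative.
base_learners = [
    ("logreg", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
    ("nb", GaussianNB()),
]

# The meta-learner is trained on out-of-fold predictions of the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# With so few rows, report out-of-fold performance rather than a single split.
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
proba = cross_val_predict(stack, X, y, cv=outer_cv, method="predict_proba")[:, 1]

accuracy = accuracy_score(y, proba >= 0.5)
auc = roc_auc_score(y, proba)
print(f"accuracy={accuracy:.2f}  AUC={auc:.2f}")
```

The scores printed here describe the synthetic data only and say nothing about the study's results. Nesting the stacking step inside an outer cross-validation loop is one way to limit optimistic bias when the dataset is this small.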
Researchers may apply machine learning when conventional statistical models are inconclusive in homogeneous small databases with multiple variables and a complex relationship to the outcome. Machine learning is a viable option even with relatively small cohorts if the number of predictive variables is also small.
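The abstract's remark that an odds ratio overestimates the true effect when the outcome is common can be checked with a short calculation. The sketch below holds a hypothetical relative risk of 1.5 fixed and computes the corresponding odds ratio at several baseline risks. The function name and all numbers are illustrative assumptions, not values from the study.

```python
def odds_ratio_from_risks(baseline_risk: float, relative_risk: float) -> float:
    """Return the odds ratio implied by a baseline risk and a relative risk."""
    exposed_risk = baseline_risk * relative_risk
    if not (0 < baseline_risk < 1 and 0 < exposed_risk < 1):
        raise ValueError("both risks must lie strictly between 0 and 1")
    exposed_odds = exposed_risk / (1 - exposed_risk)
    baseline_odds = baseline_risk / (1 - baseline_risk)
    return exposed_odds / baseline_odds


# Hypothetical relative risk, held constant while the outcome becomes more common.
RELATIVE_RISK = 1.5

for baseline in (0.01, 0.05, 0.10, 0.40):
    or_value = odds_ratio_from_risks(baseline, RELATIVE_RISK)
    print(f"baseline risk {baseline:.2f}: RR={RELATIVE_RISK:.2f}  OR={or_value:.2f}")
```

At a baseline risk of 0.05 the odds ratio is about 1.54 and stays close to the relative risk. At a baseline risk of 0.40 it rises to 2.25, which shows why an odds ratio is a poor stand-in for relative risk when more than half of the cohort has the outcome.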
Similar works
"Why Should I Trust You?"
2016 · 14,704 citations
Coding Algorithms for Defining Comorbidities in ICD-9-CM and ICD-10 Administrative Data
2005 · 10,545 citations
A Comprehensive Survey on Graph Neural Networks
2020 · 8,931 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,532 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,046 citations