This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Machine Learning Approaches for Cardiovascular Disease Prediction: A Comparative Study
0
Citations
4
Authors
2025
Year
Abstract
Cardiovascular disease (CVD) is a leading cause of death globally, particularly in South Asia, where high cholesterol intake contributes to its prevalence. This study aims to predict CVD using machine learning models applied to patient data from healthcare systems in Dhaka, Bangladesh, and to identify the most reliable model for early diagnosis and decision support. The dataset comprises 1019 patient records collected from two prominent hospitals in Dhaka and includes nine critical features. Several machine learning models were evaluated using stratified fivefold cross-validation, with the best hyperparameters selected using GridSearchCV. Model performance was assessed using accuracy, precision, recall, F1-score, AUC, and ROC curve analysis. To enhance interpretability, SHapley Additive exPlanations (SHAP) analysis was applied, focusing on global feature importance. Among the models tested, XGBoost achieved the highest testing accuracy, with 97.12% training accuracy and 86.07% testing accuracy (AUC = 0.91). Random Forest also performed strongly, with 83.08% testing accuracy and the highest AUC (0.92). Decision Tree and K-Nearest Neighbors achieved moderate results, with testing accuracies of 78% and 74.63%, respectively. Logistic Regression and Support Vector Machine showed lower overall accuracy (~66%), though both attained high recall (0.91 and 0.95, respectively), indicating sensitivity to positive cases. These results highlight XGBoost’s robustness while also demonstrating the trade-offs of alternative models. They suggest that machine learning, particularly the XGBoost model with SHAP-based explainability, offers a promising approach for diagnosing CVD with high accuracy. Incorporating this model into medical diagnostic systems can assist healthcare specialists in making more informed and accurate decisions, potentially reducing the morbidity and mortality associated with CVD.
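The abstract reports models with roughly 66% accuracy but recall above 0.9. The sketch below shows how those two figures can coexist by computing accuracy, precision, recall, and F1-score from binary confusion-matrix counts. The `ConfusionCounts` container, the `classification_metrics` helper, and the counts themselves are hypothetical and chosen only for illustration. They are not taken from the paper's data or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container, not from the paper)."""
    tp: int  # true positives: CVD cases predicted as CVD
    fp: int  # false positives: healthy cases predicted as CVD
    tn: int  # true negatives: healthy cases predicted as healthy
    fn: int  # false negatives: CVD cases predicted as healthy


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    total = c.tp + c.fp + c.tn + c.fn
    predicted_pos = c.tp + c.fp
    actual_pos = c.tp + c.fn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / predicted_pos if predicted_pos else 0.0
    recall = c.tp / actual_pos if actual_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only (assumed, NOT the paper's confusion matrix):
# a classifier that flags most patients as positive catches nearly all
# true cases (high recall) at the cost of many false alarms (low accuracy).
counts = ConfusionCounts(tp=95, fp=63, tn=37, fn=5)
metrics = classification_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, nearly all positive cases are detected while many negatives are misclassified. This is the kind of trade-off the abstract describes for Logistic Regression and Support Vector Machine.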
Similar works
Biostatistical Analysis
1996 · 35,445 citations
UCI Machine Learning Repository
2007 · 24,290 citations
An introduction to ROC analysis
2005 · 20,602 citations
The use of the area under the ROC curve in the evaluation of machine learning algorithms
1997 · 7,103 citations
A method of comparing the areas under receiver operating characteristic curves derived from the same cases
1983 · 7,061 citations