This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Hypertension Risk Prediction Using Stacking Ensemble of CatBoost, XGBoost, and LightGBM: A Machine Learning Approach
Citations: 0
Authors: 3
Year: 2025
Abstract
Hypertension is a leading cause of cardiovascular disease, chronic kidney failure, and stroke, affecting millions worldwide. Early detection and accurate risk prediction are crucial for effective management and prevention. This study evaluates and compares the performance of different algorithms for predicting hypertension risk using a stacking ensemble approach. The model combines three gradient boosting algorithms (XGBoost, LightGBM, and CatBoost) as base learners, with Logistic Regression as the meta-learner. The dataset, sourced from Kaggle, contains 4,240 instances with demographic and clinical attributes relevant to hypertension. Preprocessing included imputing missing values with the median, removing residual null entries, and addressing class imbalance with the SMOTE algorithm. The data were split into 80% for training and 20% for testing. The stacking ensemble model achieved an overall accuracy of 92.65%, with precision, recall, and F1-scores of 0.92 for both classes. The confusion matrix showed minimal misclassification, indicating that the model distinguishes well between low-risk and high-risk individuals. These results support the study's primary goal of identifying which algorithm performs best for hypertension risk prediction. By evaluating and comparing different models, the study offers guidance on choosing the most effective algorithm for clinical decision-making and early detection strategies.
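The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function names, the toy rows, and the confusion-matrix counts are all hypothetical, and `smote_like` is a simplified interpolation in the spirit of SMOTE rather than the library implementation. The order of steps (impute, oversample, then split 80/20) follows the abstract's description and is an assumption about the actual pipeline. The stacking model itself is omitted because it depends on third-party libraries.

```python
import random
import statistics


def impute_median(rows):
    """Fill each None with the median of the observed values in its column."""
    n_cols = len(rows[0])
    medians = [
        statistics.median(r[j] for r in rows if r[j] is not None)
        for j in range(n_cols)
    ]
    return [[medians[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def smote_like(minority, n_new, k=2, seed=0):
    """Simplified SMOTE-style oversampling: interpolate between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        lam = rng.random()  # position along the segment from x to nb
        synthetic.append([a + lam * (b - a) for a, b in zip(x, nb)])
    return synthetic


def split_rows(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle indices and hold out test_fraction of the rows for testing."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(rows) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return (
        [rows[i] for i in train_idx],
        [rows[i] for i in test_idx],
        [labels[i] for i in train_idx],
        [labels[i] for i in test_idx],
    )


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical toy rows: [age, systolic blood pressure]; None marks a missing value.
rows = [[40, 120], [55, None], [61, 150], [None, 135], [48, 128],
        [70, 160], [35, 118], [52, 131], [66, 155], [44, 125]]
labels = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]

clean = impute_median(rows)
minority = [r for r, y in zip(clean, labels) if y == 1]
n_new = labels.count(0) - labels.count(1)  # synthetic samples needed to balance classes
synthetic = smote_like(minority, n_new)
balanced_rows = clean + synthetic
balanced_labels = labels + [1] * n_new
X_train, X_test, y_train, y_test = split_rows(balanced_rows, balanced_labels)

# Hypothetical confusion-matrix counts, not taken from the paper.
metrics = binary_metrics(tp=46, fp=4, fn=4, tn=46)
```

In this sketch each synthetic point lies on the line segment between two real minority samples, which is why oversampling adds no values outside the range already observed for the minority class.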
Similar works
Biostatistical Analysis
1996 · 35,445 citations
UCI Machine Learning Repository
2007 · 24,290 citations
An introduction to ROC analysis
2005 · 20,596 citations
The use of the area under the ROC curve in the evaluation of machine learning algorithms
1997 · 7,102 citations
A method of comparing the areas under receiver operating characteristic curves derived from the same cases.
1983 · 7,061 citations