OpenAlex · Updated hourly · Last updated: 12 March 2026, 13:57

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

A Healthcare-Oriented Machine Learning Framework for Early Detection of Type 2 Diabetes

Year: 2026 · Citations: 0 · Authors: 6

Abstract

Type 2 Diabetes Mellitus (T2DM) is a critical global health issue, and Bangladesh has the highest incidence rate of the disease among adults, at 13.2%. Although machine learning (ML) has been applied to T2DM prediction, current research has several drawbacks, such as small samples, missing clinical factors, and severe class imbalance. This research constructs an optimized ML model to predict T2DM in Bangladesh based on an enhanced dataset of 485 patients with a complete set of demographic and biochemical characteristics. We evaluated eight ML models (Logistic Regression, K-Nearest Neighbors, Support Vector Classifier, Decision Tree, Random Forest, Gradient Boosting, AdaBoost, and XGBoost) with rigorous GridSearchCV hyperparameter tuning. To address class imbalance, we applied five balancing techniques: SMOTE, Random Oversampling, Random Undersampling, ADASYN, and SMOTETomek. Performance was measured using Accuracy, Precision, Recall, F1 Score, Specificity, Matthews Correlation Coefficient, Kappa, Confusion Matrix, and ROC-AUC curves. After hyperparameter tuning, Random Forest achieved the best performance with $98.21\% \pm 0.0791$ accuracy, $98.47\% \pm 0.0724$ precision, and $98.43\% \pm 0.0787$ F1 score, with an AUC of 0.98. Among the balancing techniques, Random Oversampling with Random Forest performed best, with 99.50% accuracy and a 99.50% F1 score. Our approach addresses the previous limitations by integrating balanced datasets with optimized models and comprehensive evaluation, offering a solid basis for predicting T2DM at an early stage and implementing clinical intervention in Bangladesh.
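As a rough illustration of two ingredients named in the abstract, the sketch below shows random oversampling of the minority class and the derivation of Specificity, Matthews Correlation Coefficient, and Cohen's Kappa from confusion-matrix counts. It is not the authors' code: the function names `random_oversample` and `binary_metrics` are hypothetical, labels are assumed to be binary (1 = T2DM, 0 = no T2DM), and only the Python standard library is used in place of whatever tooling the paper relied on.

```python
import math
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many samples as the largest one (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def binary_metrics(y_true, y_pred):
    """Compute common binary-classification metrics from confusion counts.
    Assumes labels are 0 (negative) and 1 (positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, n)
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    f1 = ratio(2 * precision * recall, precision + recall)
    mcc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    kappa = ratio(accuracy - p_expected, 1 - p_expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
        "kappa": kappa,
    }
```

When a resampling step like this is combined with cross-validated tuning, it is usually applied to the training folds only, so that duplicated rows cannot leak into the evaluation data. Whether the paper followed that protocol is not stated in the abstract.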

Similar works