This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Development and Temporal Validation of Explainable Machine Learning Models for Predicting Vitamin B12 Deficiency Using Routine Laboratory Analytes
Citations: 0
Authors: 4
Year: 2026
Abstract
<b>Background/Objectives:</b> Vitamin B12 deficiency is a prevalent yet frequently underdiagnosed condition, largely due to the limited diagnostic accuracy of serum total B12 and the restricted availability of confirmatory biomarkers such as holotranscobalamin and methylmalonic acid. This study aimed to develop and validate explainable machine learning (ML) models capable of predicting vitamin B12 deficiency using only routinely available laboratory examinations, thereby supporting early detection within standard diagnostic workflows.
<b>Methods:</b> This retrospective study included 51,630 adult patients who underwent concurrent vitamin B12 testing and routine laboratory evaluation between 2015 and 2025. An independent temporal validation cohort of 34,744 patients was used to assess generalizability. Eight supervised ML algorithms were developed within a four-stage experimental framework incorporating default modeling, probability-threshold optimization, hyperparameter tuning, and feature engineering. Model performance was evaluated using AUC-ROC, AUC-PR, sensitivity, specificity, F1 score, accuracy, Matthews correlation coefficient, and likelihood ratios. Model explainability and clinical utility were assessed using SHAP, LIME, and decision curve analysis.
<b>Results:</b> Among all algorithms, CatBoost demonstrated the most balanced and clinically relevant performance. In the threshold-optimized configuration, the model achieved a sensitivity of 0.92, specificity of 0.67, F1 score of 0.82, AUC-ROC of 0.88, and AUC-PR of 0.86 in the test set. Temporal validation confirmed robust generalizability, with improved discrimination (AUC-ROC 0.90; AUC-PR 0.91) and stable calibration. Explainability analyses identified hematologic indices (MCV, HGB, HCT, RDW), iron-related markers, inflammatory measurands, and age as the most influential contributors, consistent with known pathophysiology.
<b>Conclusions:</b> This study presents a large-scale, explainable, and temporally validated ML framework for predicting vitamin B12 deficiency using routine laboratory data alone. The model demonstrates strong diagnostic performance, biological plausibility, and potential for seamless integration into laboratory and clinical decision-support systems, enabling cost-effective and early identification of patients at risk.
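The abstract lists likelihood ratios among the evaluation metrics but quotes only sensitivity and specificity for the threshold-optimized model. As a minimal sketch, the snippet below derives the positive and negative likelihood ratios from those two reported values using the standard definitions, and shows how they shift a pre-test probability. The function names and the 30% pre-test probability are hypothetical and chosen for illustration only. The values reported in the full article may differ because of rounding.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) from sensitivity and specificity (standard definitions)."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1.0 - specificity)   # true-positive rate / false-positive rate
    lr_neg = (1.0 - sensitivity) / specificity   # false-negative rate / true-negative rate
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Update a pre-test probability with a likelihood ratio via odds."""
    if not (0.0 < pre_test_probability < 1.0):
        raise ValueError("pre_test_probability must lie strictly between 0 and 1")
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Test-set values quoted in the abstract (threshold-optimized CatBoost).
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.92, specificity=0.67)

# Hypothetical pre-test probability, NOT taken from the paper.
assumed_pre_test = 0.30
p_if_positive = post_test_probability(assumed_pre_test, lr_pos)
p_if_negative = post_test_probability(assumed_pre_test, lr_neg)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"post-test probability if flagged:     {p_if_positive:.2f}")
print(f"post-test probability if not flagged: {p_if_negative:.2f}")
```

With the quoted sensitivity of 0.92 and specificity of 0.67, this gives LR+ ≈ 2.8 and LR- ≈ 0.12. A negative prediction therefore lowers the odds of deficiency far more than a positive prediction raises them, which fits a model tuned for high sensitivity.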