This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Generative Adversarial Imputation Networks (GAIN) to Handle Missing Clinical Data for IVF Success Prediction
Citations: 0
Authors: 6
Year: 2025
Abstract
Missing data are a common challenge for machine learning (ML) approaches that provide clinical decision support, especially when predicting complex medical outcomes such as those of In Vitro Fertilization (IVF) treatment. Imputation methods have been studied as a way to handle missing data and have been shown to improve the performance of prediction models. Recently, Generative Adversarial Imputation Networks (GAIN) have been successfully applied to handle missing values in clinical data. This study implements and evaluates four imputation methods: one statistical single-imputation method and three ML-based imputation approaches, namely KNN, MissForest, and GAIN. Imputation was applied to the full dataset, and the resulting complete data were then used to construct three task-specific datasets: blastocyst, pregnancy, and birth. The datasets are built from an IVF treatment database of over 300 patients and comprise over 50 discriminative features. Prediction models are built using a 10-fold cross-validation split and five classification algorithms: MLP, Random Forest, XGBoost, Gradient Boosting, and Bayesian Logistic. After imputation, feature value distributions remained largely consistent with the distributions prior to imputation. Results suggest that GAIN converges effectively during model training, as both generator and discriminator losses decrease considerably. GAIN imputation outperformed all other ML-based imputation methods on the three prediction tasks. For the negative class, single imputation yields F1 scores of 0.50, 0.56, and 0.13 for blastocyst, pregnancy, and birth, respectively, while GAIN yields 0.74, 0.75, and 0.76. For the positive class, single imputation yields F1 scores of 0.67, 0.45, and 0.81, while GAIN yields 0.76, 0.51, and 0.85. The improvement in classification performance after imputation suggests that a generative imputation approach might be beneficial to the task at hand.
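To make the contrast between single imputation and a neighbor-based method concrete, the following is a minimal, standard-library-only sketch. It is not the authors' implementation: the toy rows, the function names (`mean_impute`, `knn_impute`, `per_class_f1`), the use of the column mean as the single-imputation statistic, and the distance rescaling are all assumptions made for illustration. GAIN and MissForest are not shown. The last helper computes F1 separately per class, which is how the abstract reports its results.

```python
from math import sqrt
from statistics import mean


def mean_impute(rows):
    """Single imputation (assumed here to be the column mean): fill each None."""
    n_cols = len(rows[0])
    col_means = [mean(r[j] for r in rows if r[j] is not None) for j in range(n_cols)]
    return [[col_means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def _distance(a, b):
    """Euclidean distance over features observed in both rows,
    rescaled by the fraction of features that could be compared."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("inf")
    squared = sum((x - y) ** 2 for x, y in shared)
    return sqrt(squared * len(a) / len(shared))


def knn_impute(rows, k=2):
    """Fill each None with the mean of that feature over the k nearest
    rows that observe it; fall back to the column mean if none qualify."""
    fallback = mean_impute(rows)
    out = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            donors = [(d, v) for d, v in donors if d != float("inf")][:k]
            out[i][j] = mean(v for _, v in donors) if donors else fallback[i][j]
    return out


def per_class_f1(y_true, y_pred, label):
    """F1 score treating `label` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: two numeric features, one missing value.
rows = [
    [30.0, 1.0],
    [32.0, None],
    [40.0, 5.0],
    [41.0, 6.0],
]

print(mean_impute(rows)[1])      # the gap is filled with the column mean
print(knn_impute(rows, k=1)[1])  # the gap is filled from the closest row
```

On this toy data the column mean pulls the missing value toward the two distant rows, while the neighbor-based fill copies it from the most similar row. That difference is the intuition behind expecting ML-based imputation to preserve more structure than a single statistic. It does not reproduce the study's results.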
Similar works
Revised 2003 consensus on diagnostic criteria and long-term health risks related to polycystic ovary syndrome
2004 · 8,959 citations
Revised 2003 consensus on diagnostic criteria and long-term health risks related to polycystic ovary syndrome (PCOS)
2003 · 5,958 citations
Variations in the Pattern of Pubertal Changes in Boys
1970 · 5,114 citations
A Critical Evaluation of Simple Methods for the Estimation of Free Testosterone in Serum
1999 · 3,676 citations
Clinical longitudinal standards for height, weight, height velocity, weight velocity, and stages of puberty.
1976 · 3,142 citations