OpenAlex · Updated hourly · Last updated: 16 March 2026, 05:59

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Machine learning prediction of thyroid cancer recurrence for early screening and clinical decision pathways: a retrospective cohort study

2026 · 0 citations · Discover Oncology · Open Access

Citations: 0 · Authors: 7 · Year: 2026

Abstract

Recurrence prediction in differentiated thyroid carcinoma (DTC) remains clinically challenging despite generally favorable outcomes and well-established treatment strategies. Improving early identification of patients at elevated recurrence risk may enhance individualized surveillance and therapeutic decision-making. This study evaluated the performance and clinical utility of machine-learning models for recurrence prediction using routinely collected clinicopathologic features from a publicly available cohort of 383 patients with long-term follow-up. Sixteen variables were initially analyzed following international guideline definitions. Random Forest, XGBoost, and LightGBM models were developed using stratified training-test splits, SMOTE for class-imbalance correction, fivefold cross-validation, probability calibration, and decision-curve analysis. Shapley Additive Explanations (SHAP) was applied to quantify global and local feature contributions and to derive simplified feature subsets. Models trained with 4, 6, 8, and full feature sets were compared to assess the impact of dimensionality reduction on discrimination and interpretability. Full-feature models achieved strong performance, with Random Forest obtaining the highest AUC (0.931). Notably, a compact 4-feature Random Forest model (including Risk, N stage, T stage, and Age) maintained high discriminatory ability (AUC 0.913; accuracy 0.862; recall 0.750), demonstrating that substantial simplification preserved predictive value. Performance improvements plateaued beyond 6 to 8 features, indicating limited incremental benefit from larger feature sets. SHAP analysis consistently identified Risk, N, T, and Age as dominant predictors. These findings highlight that streamlined, interpretable ML models using a small number of clinically accessible features can provide accurate and explainable recurrence prediction in DTC.
Such models offer advantages in computational efficiency, transparency, and real-world deployability, supporting their potential integration into electronic health record systems or point-of-care decision tools. Future work should prioritize multicenter external validation and incorporation of additional pathological or molecular markers to enhance generalizability and clinical applicability.
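
The abstract reports discrimination (AUC), recall, and decision-curve analysis without spelling out how these quantities are computed. The following is a minimal sketch using only the Python standard library; it is not the authors' code. The function names are hypothetical, and the labels and predicted probabilities are invented toy values, not study data. The net benefit shown is the standard decision-curve quantity TP/n - FP/n * pt/(1 - pt) at threshold probability pt.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def recall_at(y_true: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """Share of true recurrences flagged when predicting positive at score >= threshold."""
    flagged = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    total_pos = sum(1 for y in y_true if y == 1)
    if total_pos == 0:
        raise ValueError("no positive cases")
    return flagged / total_pos


def net_benefit(y_true: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """Decision-curve net benefit at threshold probability pt."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Toy example (invented values): 1 = recurrence, 0 = no recurrence.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1, 0.05]

print("AUC:", round(roc_auc(y_true, scores), 3))
print("Recall at 0.5:", round(recall_at(y_true, scores, 0.5), 3))
print("Net benefit at 0.5:", round(net_benefit(y_true, scores, 0.5), 3))
```

A decision curve is obtained by evaluating net_benefit over a range of thresholds and comparing the model against the "treat all" and "treat none" strategies. In practice, library implementations would normally be used for AUC and recall.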

Similar works