This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
ExCaPT: Explainable Cancer Prediction with Transformer-based models
Citations: 0
Authors: 7
Year: 2025
Abstract
Cancer remains one of the most significant global health challenges. Despite advances in treatment, early detection remains a critical concern. The increasing availability of Electronic Health Records (EHR) offers a unique opportunity to enhance our understanding of patient health trajectories and develop more accurate risk prediction models. However, the complexity and heterogeneity of EHR data pose significant challenges for analysis and modeling. Over the years, a range of models, from traditional machine learning to advanced deep learning (DL) approaches, have been employed to address the multidimensional complexities of health data. Notably, transformer-based models have emerged as a promising solution for capturing longitudinal, sequential, and multimodal data. This work introduces ExCaPT, a transformer encoder-based predictive model designed to identify individuals at higher risk of developing colorectal cancer (CRC), while providing interpretable outputs. The model leverages a comprehensive dataset, incorporating features such as age, sex, smoking status, and longitudinal EHR data including disease and drug trajectories. On a test dataset, ExCaPT achieved a ROC-AUC of 85.9 ± 0.1, a sensitivity of 68.1 ± 0.3, and a specificity of 82.2 ± 0.1. These results outperform those of an LSTM model used as a reference for sequence data. This highlights the potential of transformer-based models in the early identification of high-risk cancer patients, marking an important step forward in the field of precision healthcare. Additionally, we employed several explainability approaches, including attention-based, embedding-based, and integrated gradients analyses, which allowed us to identify key input features, visualize latent representations, and quantify the contributions of different features to the predictions, providing complementary insights into the model’s decision-making process.
Highlights
ExCaPT, a transformer-encoder model using demographic, disease, and medication trajectories from EHR data, predicts early colorectal cancer (CRC) risk with high performance.
The model provides interpretability through attention scores, embedding-based analyses, and integrated gradients to reveal influential features.
This modeling framework is generalizable and can be extended to prediction for other cancer types.
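For readers less familiar with the evaluation metrics named in the abstract (ROC-AUC, sensitivity, specificity), the following is a minimal sketch of how they are typically computed for a binary risk classifier. It is not the authors' evaluation code. The labels, scores, the 0.5 decision threshold, and the function names are all hypothetical and chosen only for illustration.

```python
# Illustrative only: toy labels and scores, not data from the paper.

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical example: 3 cases (label 1) and 5 controls (label 0).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.3, 0.8, 0.4, 0.2, 0.1, 0.05]

threshold = 0.5  # assumed cut-off; the abstract does not state one
y_pred = [1 if s >= threshold else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} roc_auc={auc:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas ROC-AUC summarizes ranking quality across all thresholds, which is why the abstract reports all three.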