This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Revealing the Impact of Pre-training Data on Medical Foundation Models
2
Citations
28
Authors
2025
Year
Abstract
Medical foundation models (FMs), pre-trained on large-scale unlabelled data, have demonstrated robust performance and high efficiency when fine-tuned to various clinically relevant applications. However, the impact of pre-training data on aspects of medical FM performance such as generalisability and fairness, which underpin the fine-tuned models, remains unexplored. To address this, we sampled two large cohorts from two sites, Moorfields Eye Hospital (UK) and the Shanghai Diabetes Prevention Program (China), each containing 904,170 retinal images for FM pre-training. We developed parallel FMs using identical processes and compared their fairness and generalisability on downstream tasks with publicly available datasets and held-out data from each site. Our results demonstrate that, despite strong generalisability, medical FMs perform significantly better on downstream data that align with the pre-training data in approximately one-third of tasks. Additionally, age is a key metadata factor impacting FM fairness and generalisability in retinal images, whereas sex and ethnicity show no such impact. These findings advocate for an evidence-based approach to pre-training data selection and highlight the importance of transparency even for pre-training data, ultimately enhancing FM capabilities and guiding FM development and customised application in healthcare.
Similar works
"Why Should I Trust You?"
2016 · 14,198 citations
A Comprehensive Survey on Graph Neural Networks
2020 · 8,576 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,084 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,444 citations
Artificial intelligence in healthcare: past, present and future
2017 · 4,382 citations
Authors
- Yukun Zhou
- Zheyuan Wang
- Yilan Wu
- Ariel Yuhan Ong
- Siegfried K. Wagner
- Eden Ruffell
- Mark A. Chia
- Zhouyu Guan
- Lie Ju
- Justin Engelmann
- David A. Merle
- Tingyao Li
- Jia Shu
- Paul Nderitu
- Ke Zou
- Jocelyn Hui Lin Goh
- Qingshan Hou
- Xiaoxuan Liu
- Ya Xing Wang
- Yih Chung Tham
- André Altmann
- Carol Y. Cheung
- Daniel C. Alexander
- Eric J. Topol
- Alastair K. Denniston
- Tien Yin Wong
- Bin Sheng
- Pearse A. Keane
Institutions
- University College London (GB)
- Shanghai Jiao Tong University (CN)
- Beijing Academy of Artificial Intelligence (CN)
- Tsinghua University (CN)
- Institute of Ophthalmology (MX)
- The Royal Victorian Eye & Ear Hospital (AU)
- Moorfields Eye Hospital NHS Foundation Trust (GB)
- Moorfields Eye Hospital (GB)
- National University of Singapore (SG)
- Singapore National Eye Center (SG)
- Singapore Eye Research Institute (SG)
- University of Birmingham (GB)
- Chinese University of Hong Kong (HK)
- Scripps Research Institute (US)