This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Precision vs recall in surgical AI: Performance and limitations of tabular foundation model for perioperative classification and risk prediction
Citations: 0
Authors: 4
Year: 2026
Abstract
Background: The Tabular Prior-data Fitted Network (TabPFN) is a recently introduced Transformer-based foundation model designed specifically for structured tabular data. It performs inference on a new task without hyperparameter tuning or extensive data preprocessing. Despite this potential, its application in surgical data science remains unexplored. This study evaluates TabPFN across six surgical classification tasks.
Objective: To assess TabPFN's performance against benchmark machine learning models in surgical classification tasks and to identify optimal application scenarios based on sample size and outcome incidence.
Methods: Perioperative data from two independent medical centres were used, comprising a large-scale cohort (n = 67,134) and a medium-scale cohort (n = 6,888). Six clinically relevant classification tasks were developed. TabPFN was systematically compared with benchmark models (XGBoost, random forest, support vector machine, logistic regression, and decision tree) using area under the receiver operating characteristic curve, accuracy, precision, recall, F1-score, and calibration.
Results: In tasks with large samples (n > 3,000) and higher outcome incidence (>40%), TabPFN achieved the highest recall and F1-score among all models. In tasks with low outcome incidence (<20%), TabPFN attained the highest precision. Calibration analysis showed that TabPFN provided reliable probability estimates in large-sample tasks (n > 3,000), but its calibration declined noticeably in tasks with low outcome incidence (<20%).
Conclusions: TabPFN is a promising approach for tabular modelling in surgical data science and shows considerable promise in tasks with sample sizes exceeding 10,000. However, it cannot yet fully replace established benchmark models. Application scenarios for TabPFN should be selected based on key task-specific characteristics, and further targeted training is needed in the future.
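The precision and recall results reported in the abstract can pull in opposite directions, especially at low outcome incidence. The following minimal sketch shows how the named metrics are derived from confusion-matrix counts. The counts, the `Confusion` class, and the two model labels are hypothetical illustrations and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for a binary classifier."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def precision(c: Confusion) -> float:
    """Share of predicted positives that are truly positive."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: Confusion) -> float:
    """Share of true positives that the model finds."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def f1(c: Confusion) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def accuracy(c: Confusion) -> float:
    """Share of all predictions that are correct."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


# Hypothetical task: 1,000 patients, 10% outcome incidence (100 positives).
# A cautious model flags few patients; a liberal model flags many.
cautious = Confusion(tp=40, fp=10, fn=60, tn=890)
liberal = Confusion(tp=80, fp=120, fn=20, tn=780)

for name, c in [("cautious", cautious), ("liberal", liberal)]:
    print(
        f"{name}: precision={precision(c):.2f} recall={recall(c):.2f} "
        f"F1={f1(c):.2f} accuracy={accuracy(c):.2f}"
    )
```

In this made-up example both models reach the same F1-score while trading precision for recall, which is why a single summary metric may not be enough to choose a model for a low-incidence task.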