This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Accuracy of the GPT-5 Mini in Predicting Six-Week Postoperative Knee Flexion Following Total Knee Replacement: A Retrospective Cohort Study
Citations: 0
Authors: 7
Year: 2026
Abstract
Background and objective: Artificial intelligence (AI) models such as ChatGPT are increasingly explored for clinical prediction, yet their accuracy in forecasting early functional outcomes after total knee replacement (TKR) remains unclear. This study evaluated the accuracy of the GPT-5 mini model (OpenAI, San Francisco, CA, USA), accessed through the ChatGPT platform, in predicting six-week postoperative knee flexion following TKR, and assessed whether patient factors influence prediction error.

Methods: This retrospective cohort study included 160 patients who underwent TKR at a UK tertiary center. Age, sex, BMI, diabetic status, smoking status, American Society of Anesthesiologists (ASA) grade, and six-week postoperative knee flexion were extracted from electronic records. GPT-5 mini generated predicted flexion values from a standardized prompt. Predicted and actual flexion were compared using the Wilcoxon signed-rank test, and agreement was evaluated using Bland-Altman analysis. Subgroup analyses assessed age, diabetes, smoking, ASA grade, and BMI.

Results: Median actual flexion was 95°, while median GPT-5 mini predicted flexion was 103° (p < 0.0001). Median absolute error was 10°. Significant overestimation occurred across most age groups, in both diabetic and non-diabetic patients, in both smokers and non-smokers, and across all ASA grades. Absolute error differed significantly by ASA grade (I: 17°, II: 9°, III: 6°; p < 0.001). BMI showed no association with prediction error.

Conclusion: GPT-5 mini overestimated six-week postoperative flexion, with the greatest inaccuracies in younger, healthier patients and smaller errors in those with a higher comorbidity burden. GPT-5 mini is therefore not reliable for this task and should not be used clinically without rigorous validation on institution-specific datasets.
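The agreement statistics named in the methods can be illustrated with a minimal sketch. It assumes the conventional Bland-Altman definitions (bias as the mean of the paired differences, 95% limits of agreement as bias ± 1.96 standard deviations); the abstract does not state which variant the authors used. The function names and the five value pairs are hypothetical and are not data from the study.

```python
from statistics import mean, median, stdev


def bland_altman(predicted, actual):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Assumes the conventional 95% limits of agreement: bias +/- 1.96 * SD
    of the paired differences (predicted minus actual).
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    diffs = [p - a for p, a in zip(predicted, actual)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def median_absolute_error(predicted, actual):
    """Median of the absolute paired differences."""
    return median(abs(p - a) for p, a in zip(predicted, actual))


# Made-up illustrative flexion values in degrees, NOT data from the study.
predicted = [105, 100, 110, 95, 103]
actual = [95, 90, 100, 100, 93]

bias, lower, upper = bland_altman(predicted, actual)
mae = median_absolute_error(predicted, actual)

print(f"bias = {bias:.1f} deg, limits of agreement = [{lower:.1f}, {upper:.1f}] deg")
print(f"median absolute error = {mae:.1f} deg")
```

A positive bias here means the predictions run higher than the measurements, which is the direction of error the abstract reports. The paired significance test itself is not reproduced in this sketch; in Python it could be run with `scipy.stats.wilcoxon(predicted, actual)`, assuming SciPy is available.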