This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Who predicts better? Human intelligence vs. artificial intelligence in surgical risk stratification for congenital heart surgery
Citations: 0
Authors: 15
Year: 2026
Abstract
Background/Introduction
Few risk models are specifically designed for adults with congenital heart disease. The GUCH and PEACH scores support personalized perioperative risk evaluation. Calculating these scores manually is tedious; platforms such as ChatGPT (CGPT) and DeepSeek (DS) could help accelerate the process.

Purpose
To determine the level of correlation between human assessment and various artificial intelligence platforms in calculating risk scores for congenital heart surgery.

Methods
Pearson correlation analysis was used to compare the original (manually calculated) scores with their AI-assisted counterparts.

Results
A total of 59 patients were analyzed. Mean age was 32.7 ± 11.5 years. Weight (64.5 ± 16.2 kg), BMI (24.8 ± 4.9 kg/m²), and height (1.60 ± 0.10 m) followed near-normal distributions. Median oxygen saturation was 95%. Hemoglobin levels (14.5 ± 2.8 g/dL) and ventricular ejection fractions (right ventricle, FEVD: 40.1 ± 10.6%; left ventricle, FEVI: 53.9 ± 12.6%) were normally distributed. Most patients were in NYHA class II (62.7%), followed by class III (22%); only one patient was in class IV.

Manually calculated congenital cardiac surgery risk scores (GUCH and PEACH) were compared with those generated by the artificial intelligence platforms CGPT 4.0 and DS:
- GUCH vs. GUCH CGPT 4.0: Pearson r = 0.31, p = 0.0174, a weak but statistically significant positive correlation.
- GUCH vs. GUCH DS: Pearson r = 0.708, p < 0.001, a strong and statistically significant positive correlation.
- PEACH vs. PEACH CGPT 4.0: Pearson r = 0.37, p = 0.005, a moderate and statistically significant positive correlation.
- PEACH vs. PEACH DS: Pearson r = 0.263, p = 0.041, a weak but statistically significant positive correlation.

Discussion
The correlation analyses suggest varying degrees of agreement between the original scoring systems and their AI-assisted counterparts. The strong correlation between GUCH and GUCH DS (r = 0.708, p < 0.001) indicates a high degree of consistency, supporting the potential interchangeability or reliability of the DS-derived scores in this context. Conversely, the weaker correlations observed with the CGPT 4.0 versions of both GUCH (r = 0.31) and PEACH (r = 0.37), though statistically significant, imply that these models capture overlapping but not identical constructs or decision patterns. The modest association between PEACH and PEACH DS (untrained) (r = 0.263) further highlights the variability introduced by untrained scoring systems and underscores the importance of calibration or training in improving concordance.

Conclusion
The findings highlight the potential of AI-assisted tools to replicate manually calculated risk scores, with varying levels of accuracy. DeepSeek, in particular, demonstrated strong agreement with manual GUCH scores. However, these tools require further validation and refinement before being adopted in clinical practice.
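
The abstract reports Pearson correlation coefficients with p-values but gives no computational detail. The following is a minimal Python sketch of that kind of calculation, not the authors' code. It assumes that all 59 patients contributed a score pair to each comparison (the abstract does not state this explicitly) and that the p-values come from the usual two-sided t-test with n - 2 degrees of freedom. The function names and the toy score lists are hypothetical and are not study data.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0 (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def two_sided_p(t, df, steps=20000):
    """Two-sided p-value from Student's t distribution.

    Integrates the t density from 0 to |t| with Simpson's rule
    and returns 1 - 2 * (that area).
    """
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    # Log of the normalising constant of the t density.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


# Hypothetical toy scores (NOT study data): manual vs. AI-assisted.
manual_scores = [2, 4, 5, 7, 9, 11]
ai_scores = [3, 3, 6, 6, 10, 9]
r_toy = pearson_r(manual_scores, ai_scores)

# Consistency check on a reported value, assuming n = 59 pairs.
t_reported = t_statistic(0.31, 59)
p_reported = two_sided_p(t_reported, df=57)
print(f"toy r = {r_toy:.3f}")
print(f"r = 0.31, n = 59 -> t = {t_reported:.2f}, p = {p_reported:.4f}")
```

Under these assumptions, r = 0.31 with n = 59 gives a two-sided p of about 0.017, in line with the reported p = 0.0174 once rounding of r is taken into account. Note that Pearson's r measures linear association only; it does not show that two scoring methods return the same values.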
Similar works
Heart Disease and Stroke Statistics—2012 Update
2011 · 7,221 citations
2015 ESC/ERS Guidelines for the diagnosis and treatment of pulmonary hypertension
2015 · 6,921 citations
The incidence of congenital heart disease
2002 · 6,015 citations
Burden of valvular heart diseases: a population-based study
2006 · 4,751 citations
Updated Clinical Classification of Pulmonary Hypertension
2013 · 4,183 citations