This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
AI decision-making performance in Maternal-Fetal Medicine compared with human specialists: a cross-sectional study
Citations: 0
Authors: 9
Year: 2025
Abstract
Background: Large Language Models (LLMs) such as ChatGPT-4 and Gemini are increasingly used in clinical care; however, their reliability in Maternal–Fetal Medicine (MFM) remains uncertain.
Objective: To evaluate how well AI case-management recommendations align with those of MFM specialists, focusing on accuracy, agreement, and clinical relevance.
Study Design & Setting: Cross-sectional study with online blinded evaluation, November–December 2024.
Sample & Methods: Twenty hypothetical MFM cases were developed. Responses were generated by ChatGPT-4, Gemini, and three MFM specialists, then rated by 22 blinded, board-certified MFM evaluators on a 10-point Likert scale. Agreement was assessed with Spearman's rho (ρ) and Cohen's kappa (κ); accuracy differences were assessed with Wilcoxon signed-rank tests.
Outcomes: ChatGPT-4 showed moderate alignment (mean 6.6 ± 2.95; ρ=0.408; κ=0.232, p<0.001) and performed well in routine cases. Gemini scored 7.0 ± 2.64 and showed negligible inter-rater reliability (κ=−0.024, p=0.352). No significant difference was found between ChatGPT-4 and clinicians (p=0.18), whereas Gemini was less accurate (p<0.01).
Conclusions: AI shows potential in routine MFM decision-making but remains limited in complex scenarios and should be used with caution.
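For readers unfamiliar with the two agreement statistics named in the abstract, the following is a minimal standard-library sketch of how Cohen's kappa and Spearman's rho are computed. It is not the authors' analysis code: the abstract does not say whether kappa was weighted or how the 10-point scores were treated, so the unweighted form is assumed here, and the function names and the sample ratings are invented purely for illustration.

```python
from collections import Counter
from statistics import mean


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters' categorical labels (assumed form)."""
    if len(a) != len(b) or not a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(a)
    # Observed agreement: share of items on which both raters give the same label.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single identical label")
    return (observed - expected) / (1 - expected)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((p - mx) * (q - my) for p, q in zip(rx, ry))
    var_x = sum((p - mx) ** 2 for p in rx)
    var_y = sum((q - my) ** 2 for q in ry)
    # Raises ZeroDivisionError if either list is constant (rho is undefined there).
    return cov / (var_x * var_y) ** 0.5


# Invented ratings for six cases; NOT data from the study.
ai_scores = [7, 8, 6, 9, 5, 7]
specialist_scores = [7, 9, 6, 8, 4, 7]
print("kappa:", round(cohen_kappa(ai_scores, specialist_scores), 3))
print("rho:", round(spearman_rho(ai_scores, specialist_scores), 3))
```

The example shows why the two measures can diverge, as they do in the reported results: kappa rewards only exact matches after correcting for chance, while rho rewards consistent ordering even when the scores differ by a point.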
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,764 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,674 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,234 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,898 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations