This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Multi-criteria evaluation of clinical decision-making performance in spinal neurosurgery and physical therapy scenarios: A comparative analysis of artificial intelligence models
Citations: 1
Authors: 5
Year: 2026
Abstract
The integration of AI in healthcare, particularly in clinical decision-making, has shown promising results. This study evaluates the performance of two advanced AI models, GPT-4 and GPT-3.5, in spinal neurosurgery and physiotherapy, areas that require precise and dynamic decision-making. We conducted a prospective, observational study with 64 participants, including neurosurgeons and physiotherapists, who evaluated AI-generated responses to 10 detailed clinical scenarios. The assessment criteria were diagnostic accuracy, treatment suitability, surgical technique detail, and rehabilitation planning. Each scenario was crafted to reflect a common yet complex clinical situation. GPT-4 consistently outperformed GPT-3.5 across all evaluated criteria, with the most significant differences observed in treatment suitability and rehabilitation planning. Statistical analyses, including paired t-tests and ANOVA, confirmed the superiority of GPT-4, highlighting its advanced language processing capabilities and broader medical knowledge base. Reliability analyses further supported these findings. Cronbach’s alpha values indicated moderate internal consistency for GPT-4 (α = 0.344) and lower consistency for GPT-3.5 (α = 0.133). Additionally, Cohen’s kappa values demonstrated moderate agreement for GPT-4 (κ = 0.65) and fair agreement for GPT-3.5 (κ = 0.48), further validating the reliability of the participants’ evaluations. While GPT-4 has significant potential as a clinical decision support tool, especially in complex and multidisciplinary fields such as spinal neurosurgery and physiotherapy, its recommendations should be carefully integrated with clinical expertise. Further research is essential to enhance its application and to ensure that AI can effectively support dynamic medical environments.
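The abstract names three standard statistics (the paired t-test, Cronbach's alpha, and Cohen's kappa) without showing how they are computed. The following is a minimal sketch of their textbook definitions, not the study's actual analysis pipeline, which this page does not describe. The function names and all rating values are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt
from statistics import mean, stdev, variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for rows = raters, columns = items (criteria).

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(row totals))
    """
    k = len(ratings[0])
    item_vars = [variance(col) for col in zip(*ratings)]
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def cohen_kappa(a, b):
    """Cohen's kappa for two raters' categorical labels.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance.
    """
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    return (p_o - p_e) / (1 - p_e)


def paired_t(x, y):
    """Paired t statistic: mean difference divided by its standard error."""
    d = [xi - yi for xi, yi in zip(x, y)]
    return mean(d) / (stdev(d) / sqrt(len(d)))


# Hypothetical data, NOT taken from the study.
scores = [[1, 1], [2, 3], [3, 2]]       # 3 raters x 2 criteria
rater_1 = [1, 1, 0, 0]                  # binary "suitable" judgments
rater_2 = [1, 0, 0, 0]
model_a = [4, 5, 5, 3]                  # same raters scoring two models
model_b = [3, 3, 4, 2]

print(cronbach_alpha(scores))           # about 0.667
print(cohen_kappa(rater_1, rater_2))    # 0.5
print(paired_t(model_a, model_b))       # 5.0
```

The t statistic alone is not a p-value; converting it requires the t distribution with n - 1 degrees of freedom, which a statistics library would normally provide.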
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,231 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,084 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,444 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,423 citations