This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
427 Patient Perspectives On AI: A Pilot Study Comparing Large Language Model and Physician-Generated Responses to Routine Cervical Spine Surgery Questions
Citations: 1
Authors: 4
Year: 2025
Abstract
INTRODUCTION: Anterior cervical discectomy and fusion (ACDF) surgery is a common intervention for patients with cervical spine pathologies. The complexity of ACDF surgery and the varied quality of online health information make it difficult for patients to understand the surgery and its outcomes through online resources.
METHODS: This cross-sectional study involved three phases. Phase 1 entailed composing 10 commonly asked questions regarding ACDF surgery with the assistance of ChatGPT-3.5, ChatGPT-4.0, and Google search. Phase 2 involved collecting responses to the questions from two spine surgeons and then prompting ChatGPT-3.5 and Gemini to answer the same 10 questions. Phase 3 involved recruiting cervical spine surgery patients (n=5) and age-matched controls (n=5) to evaluate the responses provided by the two surgeons and the two LLM platforms on clarity and completeness.
RESULTS: LLM-generated responses were significantly shorter, on average, than physician-generated responses (30.0 ± 23.5 vs 153.7 ± 86.7 words, p<0.001). Study participants were more likely to give LLM-generated responses higher clarity ratings (H=6.25, p=0.012), with no significant difference in completeness ratings (H=0.695, p=0.404). On an individual question basis, there were no significant differences in ratings given to LLM-generated versus physician-generated responses. Compared with age-matched controls, cervical spine surgery patients were more likely to rate physician-generated responses as higher in clarity (H=6.42, p=0.011) and completeness (H=7.65, p=0.006).
CONCLUSIONS: Although the sample size was small, our findings indicate that LLMs offer comparable, and occasionally preferred, information in terms of clarity and completeness of responses to common ACDF surgery questions. It is particularly striking that ratings were similar given that LLM-generated responses were, on average, 80% shorter than physician responses. Further studies are needed to determine how LLMs can be integrated into spine surgery education moving forward.
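The reported figures can be sanity-checked with a few lines of arithmetic. The sketch below is not part of the study: it assumes each H statistic comes from a two-group Kruskal-Wallis comparison, so that H is approximately chi-square distributed with 1 degree of freedom (the abstract does not state the degrees of freedom). The function names are hypothetical, and only the Python standard library is used.

```python
import math

def percent_shorter(short_mean: float, long_mean: float) -> float:
    """Relative reduction in mean length, as a percentage."""
    return (long_mean - short_mean) / long_mean * 100.0

def p_from_h_df1(h: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    Assumption: two groups are compared, so df = 1. For df = 1 the
    chi-square survival function reduces to erfc(sqrt(h / 2)).
    """
    return math.erfc(math.sqrt(h / 2.0))

# Mean word counts reported in the abstract (LLM vs physician).
reduction = percent_shorter(30.0, 153.7)

# H statistics reported in the abstract, paired with their reported p-values.
reported = {6.25: 0.012, 0.695: 0.404, 6.42: 0.011, 7.65: 0.006}
recomputed = {h: p_from_h_df1(h) for h in reported}

print(f"LLM responses are about {reduction:.1f}% shorter")
for h, p in recomputed.items():
    print(f"H={h}: recomputed p={p:.3f}, reported p={reported[h]}")
```

Under that 1-degree-of-freedom assumption, the recomputed values agree with the reported p-values to rounding, and the word-count means are consistent with the "80% shorter" statement.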