This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Application and efficacy of artificial intelligence in patient education on spinal cord injuries
Citations: 0
Authors: 13
Year: 2026
Abstract
Spinal cord injuries (SCI) present complex challenges for patients, who increasingly turn to online resources for supplementary information. Large language models (LLMs) like ChatGPT and Google Gemini have emerged as potential tools for patient education. However, concerns about the accuracy, clarity, and comprehensiveness of their responses remain, particularly in specialized fields such as SCI. This study aimed to evaluate the performance of ChatGPT 4, ChatGPT 3.5, and Google Gemini in addressing common patient questions about SCI. A systematic process was used to identify 10 key patient questions related to SCI from online sources, PubMed, and Google Trends. These questions were submitted to ChatGPT 4, ChatGPT 3.5, and Google Gemini using a standardized prompt and a 150-word response cap to elicit expert-like responses. Eight blinded spine surgeons evaluated the chatbot-generated answers for quality, clarity, empathy, and comprehensiveness using a validated rating system. Responses were categorized as “excellent,” “satisfactory with minimal clarification,” “satisfactory with moderate clarification,” or “unsatisfactory.” Across all three models, the majority of responses were rated as either excellent or requiring only minimal clarification. ChatGPT 4 achieved the highest proportion of high-quality responses, with nearly 90% rated as “excellent” or “minimal clarification required.” ChatGPT 3.5 and Google Gemini performed similarly, with slightly lower percentages of high-quality responses. No statistically significant differences were observed between the models in overall performance. In a standardized single-turn, 150-word setting, publicly available LLMs produced largely satisfactory answers to common SCI questions with comparable performance across models. LLMs can be recommended as adjuncts for general patient education, while their outputs should be reviewed within clinical care.
Further studies should test multi-turn interactions, include patient and multidisciplinary evaluators, compare chatbot responses with clinician-authored answers, and evaluate the performance of domain-specific medical LLMs. II.
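The four-category rating scale described in the abstract lends itself to a simple aggregation: the reported "high-quality" proportion is the share of ratings in the top two categories. The sketch below illustrates that calculation; the sample ratings are hypothetical and do not reproduce the study's data.

```python
from collections import Counter

# Rating categories from the study's validated scale, best to worst.
CATEGORIES = (
    "excellent",
    "satisfactory with minimal clarification",
    "satisfactory with moderate clarification",
    "unsatisfactory",
)

def high_quality_share(ratings):
    """Proportion of ratings that are 'excellent' or need only minimal clarification."""
    counts = Counter(ratings)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown rating(s): {unknown}")
    top_two = counts[CATEGORIES[0]] + counts[CATEGORIES[1]]
    return top_two / len(ratings)

# Hypothetical example: eight blinded raters scoring one chatbot answer.
sample = [
    "excellent", "excellent", "excellent", "excellent",
    "satisfactory with minimal clarification",
    "satisfactory with minimal clarification",
    "satisfactory with moderate clarification",
    "unsatisfactory",
]
print(high_quality_share(sample))  # 6 of 8 ratings -> 0.75
```

Comparing these per-model proportions, as the study does, would then be a matter of applying a standard test for differences between proportions.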
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,260 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,116 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,493 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,438 citations
Authors
Institutions
- University of Regensburg (DE)
- Karl Landsteiner University of Health Sciences (AT)
- University Hospital Regensburg (DE)
- Charité - Universitätsmedizin Berlin (DE)
- University Hospital of Bern (CH)
- Swiss Paraplegic Center (CH)
- Berufsgenossenschaftliche Unfallklinik Murnau (DE)
- Berufsgenossenschaftliche Unfallklinik Frankfurt am Main (DE)
- OTH Regensburg (DE)