OpenAlex · Updated hourly · Last updated: 08.05.2026, 15:05

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Can large language models reliably educate patients after kyphoplasty? A clinician-rated comparative study of ChatGPT and Gemini

2026 · 0 citations · Journal of Orthopaedic Reports · Open Access
Open full text at publisher

Citations: 0
Authors: 6
Year: 2026

Abstract

Large language models, such as ChatGPT and Google Gemini, are increasingly used in medicine for purposes ranging from medical education to research. Given the accessibility of consumer-facing models, patients may turn to them for answers to their medical questions. This study compared outputs from ChatGPT and Google Gemini in response to common post-operative questions from patients after kyphoplasty. Thirteen common post-operative questions were compiled and posed to ChatGPT and Gemini. Five clinicians assessed the clinical accuracy and appropriateness of the responses on a 5-point Likert scale; reviewers were blinded to model identity. Readability was evaluated by three raters using the Flesch-Kincaid grade level and a 3-point Likert scale. Matched-pair t-tests were used to compare responses from ChatGPT and Google Gemini, with statistical significance defined as a p-value < 0.05. ChatGPT responses were more accurate (p < 0.001) and more appropriate (p < 0.01) than those from Gemini. ChatGPT's average Flesch-Kincaid grade level was 12.2, compared to 13.0 for Gemini (p = 0.05). On the 3-point Likert scale for readability, ChatGPT scored an average of 1.56/2, while Gemini scored 1.85/2 (p = 0.01). ChatGPT outperformed Gemini in clinical accuracy and appropriateness of responses. The readability results were mixed: the Flesch-Kincaid system indicated that ChatGPT generated responses at a higher grade level, while the Likert scale showed that Gemini's responses were easier to read. Although ChatGPT demonstrated better clinical accuracy and appropriateness, the use of LLMs should not replace clinician-delivered postoperative counseling.
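The Flesch-Kincaid grade level cited in the abstract is a standard readability formula: 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sketch below is a minimal illustration of that formula, not the tooling used in the study; the vowel-group syllable counter is a naive heuristic, so its scores will differ somewhat from dedicated readability software.

```python
import re

def count_syllables(word):
    # Naive heuristic: count runs of vowels; every word has at least one syllable.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text):
    # FK grade = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)

# Example: a short, simple sentence scores well below grade 1.
print(round(flesch_kincaid_grade("The cat sat on the mat."), 2))
```

A grade level of 12.2 or 13.0, as reported for the two models, corresponds to late-high-school reading difficulty; patient-education materials are often targeted several grades lower.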

Topics

Artificial Intelligence in Healthcare and Education · Traumatic Brain Injury and Neurovascular Disturbances · Machine Learning in Healthcare