This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Evaluation of ChatGPT-4o as a Patient Information Tool for Common Orthopaedic Surgeries: Accuracy, Completeness, and Clinical Utility
0 Citations · 5 Authors · Year: 2025
Abstract
INTRODUCTION: Artificial intelligence chatbots, such as ChatGPT-4o ("omni"), a large language model developed by OpenAI that integrates text, image, and audio processing with web connectivity, have gained traction as potential patient education tools in orthopaedic surgery. This study aimed to evaluate the accuracy, completeness, and clinical utility of ChatGPT-4o's responses to common patient questions about six widely performed orthopaedic procedures.

METHODS: We assessed ChatGPT-4o's responses to five standardized patient-oriented queries for total knee arthroplasty, total hip arthroplasty, anterior cruciate ligament reconstruction, rotator cuff repair, anterior cervical diskectomy and fusion, and carpal tunnel release. Responses were generated using ChatGPT-4o's web-enabled version in January 2025. Two resident orthopaedic surgeons independently rated each response for accuracy, completeness, layperson clarity, misleading content, and conciseness using a structured binary rubric. The validated DISCERN instrument (16 items, maximum score 80) was adapted for quantitative assessment of information quality. Interrater reliability was assessed with Cohen kappa.

RESULTS: Overall, ChatGPT-4o generated accurate and structured responses, free of overt errors. The average DISCERN score across procedures was 43.5, classifying the information as fair. The highest average DISCERN score was for anterior cervical diskectomy and fusion (mean 45.8 ± 10.1), whereas the lowest was for rotator cuff repair (mean 41.6 ± 5.9). Factual accuracy was high (>90%), but 36% of responses contained some misleading or incomplete information. Responses explaining treatment alternatives were the most accurate and complete, whereas those outlining surgical risks performed worst. Interrater agreement was good (Cohen kappa = 0.64).
DISCUSSION: ChatGPT-4o provided generally accurate, clear, and empathetic explanations of common orthopaedic surgeries, offering a promising adjunct to conventional patient education. However, key limitations, particularly regarding alternative treatments, nuanced risks, and the lack of tailored advice, limit its stand-alone use in clinical practice. Careful oversight and clinician vetting remain essential.

CONCLUSIONS: ChatGPT-4o can supplement orthopaedic patient education by offering accessible, engaging content. However, notable gaps in detail and occasional misleading information necessitate careful review and contextual explanation by orthopaedic surgeons.