
This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Systematic Review on Large Language Models in Orthopaedic Surgery

2025 · 2 citations · Journal of Clinical Medicine · Open Access

9 authors

Abstract

Background/Objectives: Since ChatGPT was released in 2022, many Large Language Models (LLMs) have been developed, showing potential to expand the field of orthopaedic surgery. This is the first systematic review of the current state of research on LLMs in orthopaedic surgery. The aim of this study is to identify which LLMs are researched, assess their functionalities, and evaluate the quality of their results. Methods: The systematic review was conducted using the PubMed, Embase, and Cochrane Library databases in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Results: A total of 60 studies were included in the final review, all of which included ChatGPT versions 3.0 or 4.0. Five studies included Bard, and one article each covered Perplexity AI and Bing. Most studies evaluated LLM performance on orthopaedic assessment questions (23 studies) and the ability of LLMs to correctly answer open-ended questions (31 studies). In most of the included studies, the outcome measures used to assess LLM accuracy were the percentage of correct answers on multiple-choice questions or expert-graded consensus on open-ended responses. The accuracy of ChatGPT 4.0 on orthopaedic assessment questions ranged from 47.2 to 73.6% without images and from 35.7 to 65.85% with images. The accuracy of ChatGPT 3.5 ranged from 29.4 to 55.8% without images and from 22.4 to 46.34% with images. The accuracy of Bard ranged from 49.8 to 58%. Orthopaedic residents consistently scored higher than LLMs, in the range of 74.2 to 75.3%. Conclusions: ChatGPT 4 showed significant improvement over ChatGPT 3.5 in answering orthopaedic assessment questions. When the performance of orthopaedic residents was compared with that of LLMs, residents scored higher overall. There remains significant opportunity to improve LLM performance on orthopaedic assessments as well as on image-based analysis and clinical documentation.

Similar works