OpenAlex · Updated hourly · Last updated: 17 March 2026, 19:35

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Accuracy of ChatGPT for literature citations in lower limb arthroplasty

2025 · 1 citation · Journal of Orthopaedic Reports · Open Access

Citations: 1 · Authors: 6 · Year: 2025

Abstract

Artificial Intelligence (AI) enables orthopaedic surgeons to analyze large datasets efficiently. Generative chatbots like ChatGPT utilize open-access sources to provide focused, modifiable results, potentially serving as valuable research aids. However, concerns remain regarding the legitimacy of generated citations, with prior studies reporting high “hallucination” rates. This study assessed the accuracy of citations generated for lower limb arthroplasty and evaluated the impact of prompt specificity on ChatGPT v3.5 and v4.0 performance. In August 2024, ChatGPT v3.5 and v4.0 were queried using three levels of prompt specificity (simple, medium, and complex) to generate outlines with 10 citations for primary total knee arthroplasty (TKA), primary total hip arthroplasty (THA), revision TKA, and revision THA. Prompts included elements like citation style, source origin, and contextual indicators. Generated citations were verified against PubMed, Google Scholar, and Google, and classified as Nonexistent, Secondary Work, Improperly Cited, or Properly Cited. Chi-square and multivariate logistic regression identified differences and common errors. Among 240 citations (ChatGPT-3.5: n=120; ChatGPT-4.0: n=120), increasing prompt specificity significantly improved citation accuracy (p < 0.001). Simple prompts produced no legitimate primary sources, while medium and complex prompts generated similar rates (66.3% vs. 68.8%) but reduced errors (39.6% vs. 21.8%). Recent publications increased errors in publication year, journal, and authorship (OR: 1.2–1.206), while high-impact journals reduced journal errors (OR: 0.581). ChatGPT v3.5 and v4.0 show improved performance with tailored prompts, generating legitimate citations at twice the rate of hallucinations. While these models aid scientific writing, citation errors and hallucinations remain significant limitations.

Similar works