This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Authors’ Reply: Critical Limitations in Comparing ChatGPT and DeepSeek for Orthopedic Assessment
Citations: 0
Authors: 4
Year: 2026
Abstract
We respond to comments on our study comparing ChatGPT and DeepSeek for answering orthopedic multiple-choice questions. We clarify that the reported Cohen κ values reflect inter-rater reliability within each model rather than agreement between the two models. All questions were administered in English, and the findings therefore reflect performance in an English-language context. We acknowledge limitations related to reproducibility due to the use of web-based interfaces and address concerns about data contamination. We also correct a typographical error in the reported accuracy for the pelvic and spine injury category.
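The distinction the abstract draws about Cohen κ can be illustrated with a minimal sketch. The statistic is always computed on one pair of label sequences over the same items, so what a reported κ means depends entirely on which pair was used. The function name `cohen_kappa` and the labels below are hypothetical and are not data from the study; the abstract does not state how the ratings were paired, so this shows only the general formula κ = (p_o − p_e) / (1 − p_e).

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of the same six answers from ONE model (illustrative only).
rating_1 = ["correct", "correct", "wrong", "correct", "wrong", "correct"]
rating_2 = ["correct", "correct", "wrong", "wrong", "wrong", "correct"]

print(round(cohen_kappa(rating_1, rating_2), 3))  # 0.667
```

Passing two sets of ratings for the same model gives a within-model reliability figure. Passing one model's answers against the other model's answers would give between-model agreement, which is a different quantity even though the formula is identical.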