This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
ChatGPT, Bard, and Bing Chat Are Large Language Processing Models That Answered Orthopaedic In‐Training Examination Questions With Similar Accuracy to First‐Year Orthopaedic Surgery Residents
Citations: 13
Authors: 8
Year: 2024
Abstract
PURPOSE: To assess ChatGPT's, Bard's, and Bing Chat's ability to generate accurate orthopaedic diagnoses or corresponding treatments by comparing their performance on the Orthopaedic In-Training Examination (OITE) with that of orthopaedic trainees.

METHODS: OITE question sets from 2021 and 2022 were compiled to form a composite set of 420 questions. ChatGPT (GPT-3.5), Bard, and Bing Chat were instructed to select one of the provided responses to each question. Accuracy on the composite question set was recorded and compared with that of human cohorts, including medical students and orthopaedic residents stratified by postgraduate year (PGY).

RESULTS: ChatGPT correctly answered 46.3% of composite questions, whereas Bing Chat correctly answered 52.4% and Bard correctly answered 51.4% of questions on the OITE. When image-associated questions were excluded, ChatGPT's, Bing Chat's, and Bard's overall accuracies improved to 49.1%, 53.5%, and 56.8%, respectively. Medical students correctly answered 30.8%, and PGY-1, -2, -3, -4, and -5 orthopaedic residents correctly answered 53.1%, 60.4%, 66.6%, 70.0%, and 71.9%, respectively.

CONCLUSIONS: ChatGPT, Bard, and Bing Chat are artificial intelligence (AI) models that answered OITE questions with accuracy similar to that of first-year orthopaedic surgery residents. ChatGPT, Bard, and Bing Chat achieved this result without using images or other supplementary media that human test takers are provided.

CLINICAL RELEVANCE: Our comparative performance analysis of AI models on orthopaedic board-style questions highlights ChatGPT's, Bing Chat's, and Bard's clinical knowledge and proficiency. Our analysis establishes a baseline of AI model proficiency in the field of orthopaedics and provides a comparative marker for future, more advanced deep learning models. Although AI models are still in an elementary phase, future models' orthopaedic knowledge may provide clinical support and serve as an educational tool.
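The comparison in the abstract can be retraced with a small sketch. It is not the authors' analysis, only an illustration that uses the percentages reported above; the function names `closest_cohort` and `compare` are hypothetical, and "closest" is assumed here to mean the smallest absolute difference in accuracy.

```python
# Accuracies in percent, copied from the abstract (all questions included).
MODEL_ACCURACY = {"ChatGPT": 46.3, "Bing Chat": 52.4, "Bard": 51.4}
HUMAN_ACCURACY = {
    "Medical students": 30.8,
    "PGY-1": 53.1,
    "PGY-2": 60.4,
    "PGY-3": 66.6,
    "PGY-4": 70.0,
    "PGY-5": 71.9,
}


def closest_cohort(score, cohorts):
    """Return the cohort whose accuracy is nearest to `score` (assumed metric)."""
    return min(cohorts, key=lambda name: abs(cohorts[name] - score))


def compare(models, cohorts):
    """Map each model to (nearest cohort, gap in percentage points)."""
    result = {}
    for model, score in models.items():
        cohort = closest_cohort(score, cohorts)
        result[model] = (cohort, round(score - cohorts[cohort], 1))
    return result


if __name__ == "__main__":
    for model, (cohort, gap) in compare(MODEL_ACCURACY, HUMAN_ACCURACY).items():
        print(f"{model}: nearest cohort {cohort}, gap {gap:+.1f} points")
```

Under these assumptions, all three models land nearest to the PGY-1 cohort, which matches the abstract's conclusion; a negative gap means the model scored below that cohort.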
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,553 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,444 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,943 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,792 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations