This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Comparative Evaluation of ChatGPT and Microsoft Copilot in Solving Clinical Vignette-Style Multiple-Choice Questions (MCQs) in Physiology
Citations: 0
Authors: 3
Year: 2026
Abstract
INTRODUCTION: Large language models (LLMs) are increasingly used by MBBS students as supplementary resources for exam preparation. The objective of this study was to evaluate the performance of ChatGPT and Microsoft Copilot in answering clinical vignette-style physiology MCQs from widely used resources for the United States Medical Licensing Examination (USMLE). MATERIALS AND METHODS: Fifty clinical vignette-style physiology multiple-choice questions (MCQs) from various USMLE question banks were submitted to ChatGPT and Microsoft Copilot, which were asked to choose the correct option. The performance of ChatGPT and Microsoft Copilot was assessed against the answers provided in the question banks. Two experienced physiologists independently reviewed the explanations provided by ChatGPT and Microsoft Copilot for each MCQ. The explanations were rated from one to three points according to whether they were completely incorrect, partially correct with inaccurate information, or correct with adequate information. RESULTS: ChatGPT and Microsoft Copilot correctly answered 48 and 47 out of 50 questions, reflecting accuracy rates of 96% and 94%, respectively. Both ChatGPT and Microsoft Copilot incorrectly answered one MCQ each on hypothyroidism and arrhythmia. ChatGPT provided inaccurate explanations for two MCQs, while Microsoft Copilot provided inaccurate explanations for four MCQs. CONCLUSION: ChatGPT and Microsoft Copilot both demonstrated more than 90% accuracy in answering case-based MCQs from USMLE Step 1 resources. Their incorrect answers to MCQs on hypothyroidism and their inaccurate explanations for some MCQs highlight the need for cautious use of AI by students.
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,245 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,102 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,468 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,429 citations