This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Analyzing the Accuracy of Large Language Models in United States Medical Licensing Exam Social-Science Preparation Question Banks

2025 · 0 citations · Cureus Journal of Computer Science · Open Access

Abstract

Introduction: The emergence of artificial intelligence (AI) has changed the medical education landscape. The United States Medical Licensing Exams increasingly include questions in the “Social Sciences (Ethics/Legal/Professional)” domain, which often require reasoning through complex scenarios. This study evaluates how accurately three AI platforms answer preparation questions from this domain.

Methods: Social-science questions for Steps 1 and 2 were collected from UWorld and Amboss. Each multiple-choice question was entered into ChatGPT, Gemini, and Perplexity, and the response was scored as correct or incorrect. The percentages of correct responses were compared between platforms and against student averages. Statistical significance was assessed with analysis of variance and t-tests, with a significance threshold of p = 0.05. Because this study focused on quantitative performance, outcomes were limited to accuracy on multiple-choice items rather than direct measures of empathetic reasoning.

Results: For Step 1, 109 UWorld and 63 Amboss questions were available; for Step 2, 189 UWorld and 185 Amboss questions were available. Google Gemini had the highest accuracy for Step 1 (86.6%), while ChatGPT had the highest accuracy for Step 2 (83.6%). All platforms outperformed the student average for Step 1 (68.5%) and Step 2 (68.9%); the difference was significant for ChatGPT and Gemini on Step 1 (p < 0.01) and for ChatGPT on Step 2 (p < 0.01).

Conclusion: It is critical to understand whether technology can replicate empathetic thinking or help prepare students. This study highlights the ability of AI to solve ethical dilemmas that students may struggle with.

Topics

Artificial Intelligence in Healthcare and Education · Radiology practices and education · Academic integrity and plagiarism
