This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Can Large Language Models Generate High-Quality Short-Answer Assessments? A Comparative Study in Undergraduate Medical Education
0
Citations
8
Authors
2026
Year
Abstract
Background: Generative artificial intelligence (AI) tools, including ChatGPT, have the potential to augment the design of examinations and assessments for medical learners, saving time and resources and making it possible to produce large volumes of practice problems tailored to learner-specific strengths and weaknesses. Methods: This study compares the quality of free-text assessment problems and answer keys generated by ChatGPT with those produced by faculty educators for a renal and hematology curriculum subunit. Five expert reviewers evaluated a collection of 21 free-text assessment problems: 9 drawn from historical assessment problems used in an undergraduate medical program and 12 produced with ChatGPT. Reviewers assigned each problem a score from 1 to 5 reflecting its overall quality. Results: The average quality of problems generated by ChatGPT was greater than that of human-generated problems (4.00 vs. 2.71, p < 0.001). Using ordinal mixed-effect modeling, human-generated problems had significantly lower odds of receiving higher ratings than ChatGPT-generated problems (β = −2.43, 95% confidence interval −3.34 to −1.51, p < 0.001). Conclusions: The results suggest that ChatGPT can assist expert faculty educators in producing assessment tools, with direct benefits to medical learners, although in its current state it cannot entirely replace them in this role.
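The coefficient reported in the Results can be read as an odds ratio by exponentiating it. The sketch below assumes that β is a log-odds coefficient from a cumulative logit (proportional odds) model, which is the usual form of an ordinal mixed-effects model; the abstract does not state the link function, so this is an assumption. The function name `log_odds_to_odds_ratio` is hypothetical and used only for illustration.

```python
import math


def log_odds_to_odds_ratio(beta: float) -> float:
    """Convert a log-odds coefficient to an odds ratio (assumes a logit link)."""
    return math.exp(beta)


# Values taken from the abstract: point estimate and 95% confidence limits.
beta, ci_low, ci_high = -2.43, -3.34, -1.51

odds_ratio = log_odds_to_odds_ratio(beta)
# exp() is monotonic, so the interval limits transform directly.
or_ci = (log_odds_to_odds_ratio(ci_low), log_odds_to_odds_ratio(ci_high))

print(f"OR = {odds_ratio:.3f}, 95% CI {or_ci[0]:.3f} to {or_ci[1]:.3f}")
```

Under that assumption, the odds of a human-generated problem receiving a higher rating are roughly 0.09 times those of a ChatGPT-generated problem, with an interval of about 0.04 to 0.22.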