This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Evaluation of AI-Generated Multiple-Choice Questions for Periodontology Exams: A Quality Assessment Study
Citations: 0 · Authors: 6 · Year: 2026
Abstract
<title>Abstract</title> <bold>Background:</bold> This study evaluated the quality of multiple-choice questions (MCQs) generated by ChatGPT-4o compared with faculty-written items in periodontology, using the Integrated National Board Dental Examination (INBDE) rubric. <bold>Methods:</bold> Thirty MCQs were assessed in a blinded cross-sectional comparison at Tufts University School of Dental Medicine. Fifteen questions were generated by ChatGPT-4o based on course objectives and INBDE guidelines, and fifteen were randomly selected from the departmental exam bank. Fourteen periodontology faculty members rated each item on six INBDE criteria (clarity, content accuracy, distractor quality, fairness, curricular alignment, and grammar) using a five-point Likert scale ranging from poor (1) to excellent (5). Composite scores were analyzed using a generalized linear mixed model. <bold>Results:</bold> AI-generated items achieved significantly higher composite scores than human-written questions (20.7 ± 4.9 vs. 18.3 ± 5.1; p < 0.001). In descriptive comparisons, AI-generated items also received higher ratings across all six domains, particularly in clarity and grammar. Reviewers were unable to reliably identify the source of the items, and 84.1% of AI-generated questions were judged suitable for exam use, compared with 55.7% of faculty-written items. <bold>Conclusions:</bold> ChatGPT-4o produced high-quality, well-structured MCQs, and reviewers frequently reported difficulty distinguishing their origin in this blinded assessment. While these results highlight the potential value of AI-assisted assessment design, expert supervision remains essential to ensure accuracy, cognitive depth, and alignment with educational standards. AI should serve as a supportive tool that complements rather than replaces faculty expertise in question development.
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,287 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,140 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,534 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,450 citations