This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Artificial Intelligence Generated Questions in Medical Education: How Prompt Design in Different Chatbots Shapes Assessment in Obstetrics and Gynecology?
Citations: 0
Authors: 6
Year: 2025
Abstract
Objective: The aim of this study is to assess the difficulty level of artificial intelligence (AI)-generated multiple-choice questions (MCQs) created by large language models (LLMs) using different prompts across various chatbots, compared to human-written questions.

Methods: We generated case-based MCQs on obstetrics and gynecology using two distinct prompts across four LLM-based chatbots. Expert-reviewed MCQs were administered to 97 medical students undergoing clerkship training in obstetrics and gynecology. Item difficulty indices were then calculated for each MCQ.

Results: The mean difficulty index of the AI-generated questions was 0.30. One prompt produced questions with a difficulty index of 0.34 (classified as difficult), while the other produced a lower difficulty index of 0.25 (classified as more difficult). In contrast, the mean difficulty index of the human-written questions was 0.63, indicating a moderate level of difficulty.

Conclusion: Our study highlights the challenges of using AI-generated MCQs in medical education. Although AI offers promising benefits for question generation, the questions produced were generally too difficult for undergraduate medical students. This underscores the need for more detailed and contextually informed prompt designs to better align AI outputs with assessment requirements. Although LLM-based chatbots enhance efficiency in question generation, expert review remains essential to ensure the appropriateness and quality of the items.
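The item difficulty indices reported above follow the classical definition: the proportion of examinees answering an item correctly, so lower values mean harder items. A minimal sketch (the study's exact scoring pipeline is not shown; function name and data are illustrative):

```python
def difficulty_index(responses, correct_option):
    """Classical item difficulty index: fraction of examinees choosing
    the correct option. Ranges from 0 (no one correct, hardest) to
    1 (everyone correct, easiest)."""
    if not responses:
        raise ValueError("no responses recorded for this item")
    correct = sum(1 for r in responses if r == correct_option)
    return correct / len(responses)

# Hypothetical example: of 97 students, 29 choose the keyed option "A",
# giving p ≈ 0.30 -- comparable to the mean reported for AI-generated items.
answers = ["A"] * 29 + ["B"] * 40 + ["C"] * 28
print(round(difficulty_index(answers, "A"), 2))  # → 0.3
```

Under common conventions, indices below about 0.3 flag an item as difficult and values near 0.5-0.7 as moderate, which is why the human-written mean of 0.63 is described as moderately difficult.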
Similar Works
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
An Experiment in Linguistic Synthesis with a Fuzzy Logic Controller
1999 · 5,632 citations
An experiment in linguistic synthesis with a fuzzy logic controller
1975 · 5,563 citations
A FRAMEWORK FOR REPRESENTING KNOWLEDGE
1988 · 4,548 citations
Opinion Paper: “So what if ChatGPT wrote it?” Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy
2023 · 3,360 citations