OpenAlex · Updated hourly · Last updated: 13 March 2026, 21:07

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

GPT-4 versus human authors in clinically complex MCQ creation: a blinded analysis of item quality

2024 · 4 citations · Open Access
Citations: 4
Authors: 6
Year: 2024

Abstract

Multiple-choice questions (MCQs) are a popular assessment format in medical education. Creating clinically complex MCQs can be a time-consuming task for subject matter experts. Large language models such as GPT-4, a type of generative artificial intelligence (AI), are a potential tool for MCQ design. Clinically complex human-generated MCQs, at both novice and expert level, were compared with AI-generated MCQs. A generic prompt for GPT-4 was engineered, which included item-writing guidance, example MCQs, and key learning points. A standardised scoring system was developed for a consensus panel to objectively evaluate each item, blinded to the author, on categories including content validity, scope, item anatomy, cognitive skill level, item-writing flaws (IWFs), feedback comprehensiveness, veracity, adequacy of clinical reasoning, and global impression of fitness for use. Analysis showed that all groups (novice, expert, and AI) were able to generate items within scope. Expert items performed better than novice items in all categories. Expert items performed better than AI items in content validity, feedback veracity, and clinical reasoning. They also tended to test higher-order cognitive skills. There was no difference in the global impressions of expert and AI items, which suggests they may be comparable overall. With adequate prompt engineering, GPT-4 can produce MCQs testing clinically complex concepts for medical assessment. The quality of AI outputs is comparable to that of experts; however, human validation is necessary to ensure content validity. The AI-generated explanatory feedback was adequate in veracity and clinical reasoning, and may serve as an educational tool for learners.

Topics

Artificial Intelligence in Healthcare and Education
Clinical Reasoning and Diagnostic Skills