OpenAlex · Updated hourly · Last updated: 30.03.2026, 01:07

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Teaching Clinical Reasoning in the Age of AI: A Mixed-Methods Formative Evaluation of AI-Generated Script Concordance Tests and Expert Embodiment (Preprint)

2025 · 0 citations · Open Access
Open full text at the publisher

0 citations · 4 authors · Year: 2025

Abstract

<sec> <title>BACKGROUND</title> The integration of artificial intelligence (AI) in medical education is evolving, offering new tools to enhance teaching and assessment. Among these, script concordance tests (SCTs) are well suited to evaluating clinical reasoning in contexts of uncertainty. Traditionally, SCTs require expert panels for scoring and feedback, which can be resource intensive. Recent advances in generative AI, particularly large language models (LLMs), suggest the possibility of replacing human experts with simulated ones, though this potential remains underexplored. </sec> <sec> <title>OBJECTIVE</title> This study aimed to evaluate whether LLMs can effectively simulate expert judgment in SCTs by using generative AI to author, score, and provide feedback for SCTs in cardiology and pneumology. A secondary goal was to assess students’ perceptions of the test’s difficulty and the pedagogical value of AI-generated feedback. </sec> <sec> <title>METHODS</title> A cross-sectional, mixed-methods study was conducted with 25 second-year medical students who completed a 32-item SCT authored by ChatGPT-4o. Six LLMs (three trained on course material and three untrained) served as simulated experts to generate scoring keys and feedback. Students answered SCT questions, rated perceived difficulty, and selected the most helpful feedback explanation for each item. Quantitative analysis included scoring, difficulty ratings, and correlation between student and AI responses. Qualitative comments were thematically analyzed. </sec> <sec> <title>RESULTS</title> The average student score was 22.8 out of 32 (SD = 1.6), with scores ranging from 19.75 to 26.75. Trained AI systems showed significantly higher concordance with student responses (ρ = 0.64) than untrained models (ρ = 0.41). AI-generated feedback was rated as most helpful in 62.5% of cases, especially when provided by trained models.
The SCT demonstrated good internal consistency (Cronbach’s α = 0.76), and students reported moderate perceived difficulty (mean = 3.7/7). Qualitative feedback highlighted appreciation for SCTs as reflective tools, while recommending clearer guidance on Likert-scale use and more contextual detail in vignettes. </sec> <sec> <title>CONCLUSIONS</title> This is among the first studies to demonstrate that trained generative AI models can reliably simulate expert clinical reasoning in a script concordance framework. The findings suggest that AI can both streamline SCT design and offer educationally valuable feedback without compromising authenticity. Future studies should explore longitudinal effects on learning and assess how hybrid models (human and AI) can optimize reasoning instruction in medical education. </sec> <sec> <title>CLINICALTRIAL</title> <p/> </sec>


Topics

Clinical Reasoning and Diagnostic Skills · Artificial Intelligence in Healthcare and Education · Innovations in Medical Education