This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Same logs, different voices: AI-generated vs human feedback during clinical clerkship in undergraduate education (Preprint)
Citations: 0
Authors: 5
Year: 2025
Abstract
<sec> <title>BACKGROUND</title> Feedback is essential for medical students' learning during clinical clerkships, yet supervising physicians often struggle to provide meaningful written feedback due to time constraints. Large language models (LLMs) offer a promising approach to supplement human feedback, but how AI-generated and human feedback differ in authentic clinical settings remains unclear. Previous studies have yielded inconsistent findings regarding feedback length, quality, and distinguishability, with most comparisons conducted in classroom or simulation contexts rather than clinical environments. </sec> <sec> <title>OBJECTIVE</title> To examine how AI-generated feedback and supervisor-provided feedback differ when applied to medical students' clinical clerkship logs, and to identify the distinct characteristics and complementary strengths of each feedback type. </sec> <sec> <title>METHODS</title> This mixed-methods study employed a convergent design. We collected 161 sets of weekly clinical clerkship logs from fifth- and sixth-year medical students at Nagoya University, Japan, along with corresponding supervisor feedback and AI-generated feedback using GPT-4o. Ten faculty members and ten medical students evaluated both feedback types using a validated rubric assessing five categories: criteria-based, clear directions for improvement, accuracy, prioritization, and supportive tone. Quantitative analyses included paired t-tests, cumulative link mixed models, and correlation analyses. Qualitative thematic analysis examined evaluators' open-ended comments. Results were integrated using Joint Display Analysis. </sec> <sec> <title>RESULTS</title> AI feedback was significantly longer than supervisor feedback (mean: 382 vs. 98 characters, p<0.001). AI feedback scored significantly higher on the criteria-based (OR=11.81, p<0.001) and clear-direction (OR=6.61, p<0.001) categories, with no significant differences in accuracy, prioritization, or supportive tone.
AI feedback demonstrated greater quality consistency, while supervisor feedback showed higher variability (variance ratio 3.9:1). For supervisor feedback, length positively correlated with quality scores; no such correlation existed for AI feedback. All evaluators correctly identified feedback sources. Qualitative analysis of open-ended comments revealed five themes: adherence to feedback criteria and structure, continuity and consistency, perspective as a clinician, quality of Japanese language, and text length. AI provided structured, text-anchored feedback following rubric criteria, while supervisors offered experience-based feedback grounded in clinical context and professional expertise that sometimes lacked structured elements. </sec> <sec> <title>CONCLUSIONS</title> AI-generated and supervisor-provided feedback show distinct but complementary strengths. AI consistently delivers structured, criterion-based feedback aligned with written content, addressing gaps that may arise when time-pressured supervisors provide brief feedback. However, AI lacks the clinical perspective and contextual grounding that supervisors bring from direct observation and professional experience. These findings suggest that AI feedback should complement rather than replace human feedback in clinical clerkship settings, with each type addressing the other's limitations to optimize student learning. </sec>
Similar works
Making sense of Cronbach's alpha
2011 · 13,781 citations
Technology-Enhanced Simulation for Health Professions Education
2011 · 1,941 citations
The future vision of simulation in health care
2004 · 1,856 citations
Does Simulation-Based Medical Education With Deliberate Practice Yield Better Results Than Traditional Clinical Education? A Meta-Analytic Comparative Review of the Evidence
2011 · 1,710 citations
A critical review of simulation‐based medical education research: 2003–2009
2009 · 1,659 citations