OpenAlex · Updated hourly · Last updated: 15.03.2026, 00:38

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Evaluating AI-Generated Virtual Patients for Communication Training in Medical Education: A Simulation Study (Preprint)

2026 · 0 citations · Open Access
Open full text at publisher

Citations: 0 · Authors: 2 · Year: 2026

Abstract

<sec> <title>BACKGROUND</title> Simulation-based medical training often relies on scripted or pre-recorded dialogue, limiting opportunities to practice adaptive patient communication. Large language models (LLMs) can generate flexible patient dialogue, but it remains unclear whether personality-conditioned outputs produce perceptible personality cues, support clinically realistic communication, or influence judgments about whether dialogue appears AI-generated in educational contexts. </sec> <sec> <title>OBJECTIVE</title> This study examined whether AI-generated virtual patient dialogue conditioned on Big Five personality profiles produces perceptible and design-relevant differences in simulated medical communication, assessing personality inference, the relationship between perceived personality alignment and clinical realism, the communicative cues underlying these judgments, and whether LLMs can approximate human evaluations in a dental consultation scenario. </sec> <sec> <title>METHODS</title> We developed PRISM, a prototype platform featuring an LLM-powered virtual patient with configurable Big Five personality profiles. In an exploratory study, adult participants reviewed five pre-generated dental consultation transcripts in which the dentist’s dialogue was standardized using an existing educational script, while patient communication style varied by assigned personality profile. Participants evaluated personality alignment, perceived clinical realism, and whether each transcript appeared AI- or human-authored. Open-ended explanations were analyzed qualitatively, and an exploratory LLM-as-rater analysis compared automated and human judgments. </sec> <sec> <title>RESULTS</title> Participants provided 75 transcript-level judgments across five simulated dental consultations. 
Accuracy in inferring intended Big Five personality traits was low (mean 27.7%, SD 8.1%), corresponding to a mean of 6.9 (SD 2.0) correct items out of 25, with ratings clustering toward moderate trait levels. Mixed-effects models showed that inferred personality ratings did not reliably track assigned trait values. Higher inference accuracy was associated with greater perceived alignment with the assigned profile (β=1.37, SE 0.42, z=3.30, P<.001; OR 3.94, 95% CI 1.75–8.88), but not with perceived clinical realism (β=−0.38, SE 0.42, z=−0.89, P=.37; OR 0.69, 95% CI 0.30–1.57). Dialogues were classified as AI-generated in 41% of judgments, and higher realism ratings were associated with lower odds of AI attribution (β=−0.87, SE 0.33, z=−2.64, P=.008; OR 0.42, 95% CI 0.21–0.78). In exploratory analyses, LLM raters achieved 44% and 48% accuracy in personality inference but classified all transcripts as human-authored. </sec> <sec> <title>CONCLUSIONS</title> Participants relied primarily on language style, content specificity, and behavioral coherence when evaluating personality alignment, clinical plausibility, and dialogue authorship, while emotional cues played a limited role. Personality-conditioned dialogue influenced overall conversational impressions but did not support reliable recognition of individual Big Five traits, indicating constraints in trait-level personality control for communication-focused simulations. Although LLMs generated coherent and contextually appropriate patient dialogue, automated evaluations overestimated realism and misattributed authorship. Effective educational deployment of AI-generated virtual patients therefore requires human-centered validation, clear instructional framing, and design strategies that prioritize interactional coherence and contextual realism over latent personality parameterization. </sec>
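The odds ratios reported above are the exponentiated logistic-regression coefficients, with Wald 95% confidence intervals obtained as exp(β ± 1.96·SE). A minimal sketch of that conversion (illustrative only; values computed from the rounded β and SE in the abstract, so the CI bounds differ slightly from the paper's, which are based on unrounded estimates):

```python
import math

def odds_ratio(beta: float, se: float, z: float = 1.96):
    """Convert a log-odds coefficient to an odds ratio
    with a Wald 95% confidence interval."""
    return (math.exp(beta),          # point estimate
            math.exp(beta - z * se), # lower bound
            math.exp(beta + z * se)) # upper bound

# Inference accuracy vs. perceived profile alignment (beta=1.37, SE=0.42)
or_est, lo, hi = odds_ratio(1.37, 0.42)
print(f"OR {or_est:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
# -> OR 3.94, 95% CI 1.73-8.96 (paper reports 1.75-8.88 from unrounded values)
```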


Topics

Artificial Intelligence in Healthcare and Education · Simulation-Based Education in Healthcare · Machine Learning in Healthcare