This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
AI-Generated Clinical Summaries: Errors and Susceptibility to Speech and Speaker Variability
Citations: 0
Authors: 9
Year: 2025
Abstract
Summary Box

What is already known on this topic: Clinical AI Scribe outputs can contain errors, and the impact of human factors (e.g. communication style, accents, speech impairments) in clinical contexts remains under-characterised.

What this study adds: In controlled simulations, patient personality and accent did not significantly alter total CAIS errors, with omissions predominating and hallucinations/inaccuracies remaining low. Speech-impairment effects were highly varied, with near-perfect recognition for cleft palate and vowel disorders, whereas phonological impairment substantially reduced accuracy.

How this study might affect research, practice or policy: Supports clinician-in-the-loop deployment with local validation across representative accents and impairment profiles, prioritising detection of clinically critical errors. Routine governance should include subgroup performance reporting (accents, impairments) and ongoing audit of error rates.

Objective: The study aims to evaluate whether variability in patients’ communication style (personality, international English accents, and speech impairments) affects the accuracy of a Clinical AI Scribe (CAIS), and to identify where performance degrades.

Method and Analysis: We conducted simulated primary-care consultations in a purpose-built lab using trained actors. To investigate personality types, four scenarios were enacted, each with five patient-personality types. For accents, human-verified transcripts of consultations were used to generate all doctor/patient combinations of seven different accents (including a synthetic reference voice) across five scenarios. The CAIS produced SOAP-structured summaries that were compared with the transcripts. Errors were classified as omissions, factual inaccuracies, or hallucinations. For speech impairments, public recordings representing five profiles were transcribed and word-recognition accuracy was calculated.
Results: Personality types showed no statistically significant differences in errors (all p > 0.05). Extraversion had the highest total errors (median 3.5), while conscientiousness and agreeableness were lower (1.5 and 2.0, respectively). Across accents, both pairwise tests and group comparisons were non-significant for both patient and doctor voices (patients: p = 0.851; doctors: p = 0.98). Omissions predominated, with low rates of hallucinations and factual inaccuracies. Omissions were slightly higher for Chinese- and Indian-accented doctors (both medians 3.0). In contrast, speech impairments differed: recognition was near-perfect for cleft palate and vowel disorders, whereas phonological impairment markedly reduced recognition (p < 0.001).

Conclusions: Under controlled conditions, CAIS performance was broadly stable across communication styles and most accents but remained vulnerable to specific speech characteristics, particularly phonological impairment. Future evaluations using real-world, multi-speaker clinical audio are needed to confirm performance.
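The abstract does not state how word-recognition accuracy was calculated. As an illustration only, the sketch below assumes one common convention: accuracy is one minus the word error rate, where the error rate is the word-level edit distance between a reference transcript and the recognised text, divided by the reference length. The function names and the lower-casing and whitespace tokenisation are hypothetical choices made for this example, not the authors' method.

```python
def word_edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    # prev[j] holds the distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_recognition_accuracy(reference, hypothesis):
    """Assumed definition: 1 - WER, clipped at 0.

    Tokenisation (lower-casing, whitespace split) is an assumption
    made for this sketch.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    wer = word_edit_distance(ref, hyp) / len(ref)
    return max(0.0, 1.0 - wer)


if __name__ == "__main__":
    reference = "the patient reports chest pain"
    recognised = "the patient report pain"
    print(word_recognition_accuracy(reference, recognised))
```

Clipping at zero keeps the score in the range 0 to 1 when the recognised text contains many inserted words. A study comparing impairment profiles would compute this per recording and then summarise per profile.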
Similar works
Concurrent Chemotherapy and Radiotherapy for Organ Preservation in Advanced Laryngeal Cancer
2003 · 3,150 citations
The Voice Handicap Index (VHI)
1997 · 2,475 citations
Reliability and Factor Analysis of the Epworth Sleepiness Scale
1992 · 2,118 citations
Motor Speech Disorders: Substrates, Differential Diagnosis, and Management
1995 · 1,900 citations
Incidence of Parkinson's Disease: Variation by Age, Gender, and Race/Ethnicity
2003 · 1,727 citations