This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
“Walk a Mile in My Voice”: Voice Conversion Shapes Trust, Attribution, and Empathy in Human-AI Speech Interactions
2026 · 6 authors · 0 citations
Abstract
Speech Large Language Models (SpeechLLMs) represent a new generation of conversational AI that processes spoken language directly from audio. This enables sensitivity to prosodic cues while also inheriting voice-based demographic information that has been shown to lead to biased system behaviour. Studying how people react to and reflect on AI responses to different gender and accent presentations can contribute to understanding the potential societal impact. In this study, we examine how the vocal identity factors of accent and perceived gender shape user evaluations of AI responses while the underlying linguistic content remains constant. Through two complementary studies (Interactive Study, N=24; Observational Study, N=19), we investigate whether experiencing interactions through voice-converted identities versus observing pre-recorded conversations affects perceived harm, acceptability, trust, and responsibility attribution. We find that participants who experienced voice conversion rated benign AI responses as significantly more acceptable and reported significantly higher trust compared to those observing identical interactions, while perceived harm remained low across conditions. Qualitative feedback reveals that participants attributed different AI behaviours to voice characteristics, noting perceived differences in tone, helpfulness, and respect based on accent and gender presentation. Our findings suggest that vocal identity functions as a design variable, with systematic effects on user perception even when lexical content is held constant.