OpenAlex · Updated hourly · Last updated: 28 March 2026, 16:02

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Explainable and Interpretable AI for Voice and Speech Analysis in Clinical Care: A Systematic Review (Preprint)

2025 · 0 citations · Open Access

Citations: 0 · Authors: 4 · Year: 2025

Abstract

BACKGROUND: Driven by recent advances in artificial intelligence, particularly in medicine, audio-based voice and speech biomarkers are increasingly investigated for various medical applications as a complementary or even alternative modality to traditional medical devices. The adoption of deep learning techniques in recent literature is motivated by their superior performance compared to classical machine learning (ML) methods. However, ethical and regulatory concerns regarding the black-box nature of these models have limited their integration into clinical workflows. Consequently, Explainable AI (XAI) has recently been employed to address this issue by generating explanations for opaque model output. Ideally, medical XAI systems provide human-understandable, clinically grounded explanations, which are essential for making AI more trustworthy and thereby easing its adoption in real-world clinical settings.

OBJECTIVE: We conduct a systematic literature review of XAI methods applied to explain deep learning techniques in audio-based voice and speech clinical applications. We present a taxonomy of XAI methods in the literature and discuss the limitations of these methods, particularly regarding their application to clinical audio, the evaluation of XAI outputs, and the stakeholder relevance of generated explanations. We then identify opportunities and recommendations for future clinical audio XAI design.

METHODS: This review follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Six databases (IEEEXplore, ACM, Scopus, PubMed, Web of Science, and Nature) were searched for articles published between January 2015 and February 2025. Included studies applied explainability and/or interpretability methods to deep learning techniques for clinical voice and speech audio.

RESULTS: A taxonomy of XAI methods is presented for 30 eligible studies. These methods are grouped into four categories: visualization-based techniques, feature-importance and attribution methods, attention-based explanations, and concept detectors and model-intrinsic approaches. We find that current XAI methods and implementations lack rigorous evaluation and validation, are not suitable for the unique nature of clinical audio, and do not align with stakeholder expectations and needs.

CONCLUSIONS: This survey presents a categorization of XAI techniques employed for voice and speech AI. We discuss several gaps and considerations and identify several opportunities for future clinical audio XAI design.
