OpenAlex · Updated hourly · Last updated: 24.03.2026, 11:09

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

P04.10.A ARE FOUNDATIONAL AI SPEECH MODELS READY FOR NEURO-ONCOLOGY? THE BRAINAPP STUDY SAYS “NOT YET”

2025 · 0 citations · Neuro-Oncology · Open Access
Open full text at publisher

0 citations · 3 authors · Year: 2025

Abstract

BACKGROUND: Primary brain tumours (BTs) can cause diverse neurological deficits, including speech impairments. An objective and rapid speech assessment method is crucial for effective patient monitoring, support, and rehabilitation. However, current methods to assess speech in BT patients can be laborious and imprecise. Automated assessment systems could overcome these limitations, but they have yet to be evaluated on BT speech. Progress is hampered by the lack of public speech datasets (corpora) and of well-documented BT-related speech impairments.

MATERIALS & METHODS: To address this need, we curated the world's first BT speech corpus via The BrainApp Study. To the best of our knowledge, it is also the first study to employ remote, mobile-based BT speech collection methods. The corpus consists of 285 intelligible read-speech audio samples from 35 participants, with 27 patients contributing 186 samples. Ground truth was established through manual transcription of each participant audio file. Qualitative and quantitative analyses were performed on these transcriptions to characterise speech patterns. We then applied state-of-the-art (SOTA) artificial intelligence (AI) speech-to-text models (Whisper and Wav2Vec2) to each audio sample and compared the results to the ground truth.

RESULTS: Ground-truth analysis revealed that BT patients produced linguistic errors in the form of word repetition, insertion, substitution, and deletion. These appear to be influenced by tumour location, laterality, and grade. Neither Whisper nor Wav2Vec2 successfully captured these abnormalities. Furthermore, both models introduced errors not present in the original audio, i.e. hallucinations.

CONCLUSION: BT-related speech impairments produce measurable linguistic errors that current AI models fail to recognise. Future work will focus on developing automated speech analytic systems informed by these objective speech markers and on refining existing SOTA models.
We hope this will lay the foundations for speech analytics in neuro-oncology.
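The comparison described in the methods, scoring model transcripts against manual ground truth in terms of insertions, substitutions, and deletions, is conventionally done with a word-level Levenshtein alignment. A minimal pure-Python sketch of that idea (not the study's actual pipeline; the example transcripts are invented):

```python
def word_error_counts(ref_words, hyp_words):
    """Count substitutions, deletions, and insertions between a reference
    transcript and an ASR hypothesis via word-level Levenshtein alignment."""
    R, H = len(ref_words), len(hyp_words)
    # d[i][j] = minimum edits to turn ref_words[:i] into hyp_words[:j]
    d = [[0] * (H + 1) for _ in range(R + 1)]
    for i in range(R + 1):
        d[i][0] = i
    for j in range(H + 1):
        d[0][j] = j
    for i in range(1, R + 1):
        for j in range(1, H + 1):
            sub = d[i - 1][j - 1] + (ref_words[i - 1] != hyp_words[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    # Backtrack through the table to classify each edit
    subs = dels = ins = 0
    i, j = R, H
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1]
                and ref_words[i - 1] == hyp_words[j - 1]):
            i, j = i - 1, j - 1                 # exact match, no edit
        elif i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + 1:
            subs += 1; i, j = i - 1, j - 1      # substitution
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            dels += 1; i -= 1                   # deletion: word missing from hypothesis
        else:
            ins += 1; j -= 1                    # insertion: word added by the model
    return subs, dels, ins

# Invented toy example: the hypothesis repeats one word and drops two
ref = "the cat sat on the mat".split()
hyp = "the the cat sat mat".split()
s, d_, i_ = word_error_counts(ref, hyp)
wer = (s + d_ + i_) / len(ref)
print(s, d_, i_, wer)  # → 0 2 1 0.5
```

Hallucinations of the kind the abstract reports would surface here as insertions (and substitutions) that have no counterpart in the reference transcript.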

Topics

Topic Modeling · Biomedical Text Mining and Ontologies · Artificial Intelligence in Healthcare and Education