OpenAlex · Updated hourly · Last updated: 15 March 2026, 07:03

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Consensus automatic speech recognition (CASR) in the California Cognitive Assessment Battery (CCAB)

2022 · 1 citation · Alzheimer's & Dementia

Citations: 1 · Authors: 11 · Year: 2022

Abstract

Background: Recent reports have investigated the use of automatic speech recognition (ASR) to analyze and score verbal responses in cognitive tests. ASR scoring is objective, permits the efficient computerized administration of verbal tests, and generates timestamps that enable the detailed temporal analysis of responses. However, ASR transcription accuracy varies by engine, task, and participant, and ASR can incorrectly score responses from participants with atypical speech patterns. Here we describe the speech-transcription pipeline of the California Cognitive Assessment Battery (CCAB), which incorporates consensus ASR (CASR) to produce more accurate transcripts than is possible with any single ASR engine. We also developed a Transcript Review Tool (TRT), which facilitates the manual correction of mis-transcribed words for participants whose responses are transcribed poorly.
Method: Figure 1 shows the CCAB speech-transcription pipeline. Real-time ASR transcriptions are obtained along with transcriptions of the digital recordings of responses, using six cloud-based ASR engines (e.g., Google). The individual transcripts are then combined to produce a "consensus" transcript, together with a transcription confidence measure based primarily on the agreement between ASR engines (Figure 2). If needed, "consensus" transcripts can be manually corrected using the Transcript Review Tool, which enables the review of all words or of only those words below a predefined CASR confidence threshold (Figure 3).
Result: ASR transcriptions were obtained from 442 healthy adults (mean age = 65.1 ± 14.4 years) who each underwent three days of cognitive testing that included 25 verbal tests. In all, approximately 276 hours of speech were transcribed. Preliminary analyses show that CASR transcription accuracy surpassed 99% for tests with limited response sets (e.g., digit span, verbal list learning, and face-name binding) and exceeded 95% for discursive speech tests (e.g., picture description and logical memory).
Conclusion: CASR transcription is more accurate than that of any single ASR engine. When combined with the TRT, "consensus" ASR can produce error-free, timestamped transcripts that enable the detailed analysis of verbal responses from older individuals at risk of cognitive decline.
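
The abstract does not specify how the individual transcripts are combined, so the following is only a minimal sketch of one plausible approach: word-level majority voting, with confidence defined as the fraction of engines that agree on each word. Everything here is an assumption for illustration, including the function names `align_to_reference`, `consensus_transcript`, and `words_needing_review`, the use of the first transcript as the alignment reference, and the review threshold. It is not the CCAB implementation.

```python
from collections import Counter
from difflib import SequenceMatcher


def align_to_reference(reference, hypothesis):
    """Return, for each reference position, the aligned hypothesis word or None."""
    aligned = [None] * len(reference)
    matcher = SequenceMatcher(a=reference, b=hypothesis, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        same_length = (i2 - i1) == (j2 - j1)
        if tag == "equal" or (tag == "replace" and same_length):
            # Matching runs and equal-length substitutions pair up one to one.
            for offset in range(i2 - i1):
                aligned[i1 + offset] = hypothesis[j1 + offset]
        # Insertions, deletions, and uneven substitutions cast no vote (None).
    return aligned


def consensus_transcript(transcripts):
    """Majority-vote the words of several transcripts of the same response.

    Assumption: the first transcript serves as the alignment reference, so
    words that only other engines produce are ignored in this sketch.
    Returns a list of (word, confidence) pairs, where confidence is the
    fraction of engines that voted for the winning word.
    """
    tokenized = [text.lower().split() for text in transcripts]
    reference = tokenized[0]
    columns = [reference] + [align_to_reference(reference, t) for t in tokenized[1:]]

    consensus = []
    for position in range(len(reference)):
        votes = Counter(column[position] for column in columns)
        word, count = votes.most_common(1)[0]
        if word is None:
            continue  # Most engines did not hear this word: drop it.
        consensus.append((word, count / len(transcripts)))
    return consensus


def words_needing_review(consensus, threshold):
    """List (index, word) pairs whose confidence falls below the threshold."""
    return [(i, word) for i, (word, conf) in enumerate(consensus) if conf < threshold]


if __name__ == "__main__":
    # Hypothetical outputs from three engines for one digit-span response.
    engine_outputs = [
        "seven two nine four",
        "seven to nine four",
        "seven two nine for",
    ]
    result = consensus_transcript(engine_outputs)
    for word, confidence in result:
        print(f"{word}\t{confidence:.2f}")
    print(words_needing_review(result, threshold=0.8))
```

In this sketch, words that every engine agrees on receive a confidence of 1.0, while words with disagreement fall below it and can be routed to manual review, mirroring the role the abstract describes for the TRT's confidence threshold. A production system would likely need a true multi-sequence alignment and the use of word timestamps, neither of which is attempted here.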

Related works