This is an overview page with metadata for this scientific work. The full article is available from the publisher.

1256 Generative AI-Based Dictation Outperforms Existing Speech-to-Text Software in Neurosurgical Practice

2025 · 0 citations · Neurosurgery

Abstract

INTRODUCTION: The rapid ascent of artificial intelligence (AI) and advanced machine learning has transformed certain aspects of healthcare. Document dictation remains a significant clinical burden, contributing to burnout and detracting from quality patient care. Generative AI systems built on transformer-based technology offer efficient speech processing methods whose adaptation could streamline clinical documentation.

METHODS: Ten previously written operative reports from both cranial and spinal neurosurgery procedures were dictated and recorded by three independent physicians. The corresponding audio files were processed by (1) a modified speech-to-text model built on the architecture of OpenAI's Whisper model as its backbone and (2) Nuance Dragon as a comparative commercial standard. Word error rate (WER) was assessed by manual review.

RESULTS: Mean WER was 1.75% for Whisper and 1.54% for Dragon (p=0.08). Mean total errors were 12.1 for Whisper and 11.1 for Dragon (p=0.20). When linguistic errors were excluded, Whisper outperformed Dragon in mean WER (0.50% vs 1.34%, p<0.001) and in total errors (6.1 vs 9.7, p=0.002). Across all unstratified dictations, total errors correlated positively with word count (p<0.0001, R²=0.37) and with recording length (p<0.0001, R²=0.22). Words spoken per second correlated positively with total errors for Dragon (p<0.020, R²=0.18), but not for Whisper (p<0.204, R²=0.06). For linguistic errors, the same pattern held for Dragon (p<0.014, R²=0.20), but not for Whisper (p<0.331, R²=0.03).

CONCLUSIONS: AI-based models performed non-inferiorly to commercially available speech-to-text dictation programs. Generative models offer potential benefits, such as contextual inference, that show promise for limiting errors at higher dictation speeds and for compensating for imperfect input data.
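For readers unfamiliar with the metric, the following is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the system output, divided by the number of reference words. The abstract states that WER was reviewed manually, so this is an illustration of the standard definition rather than the authors' procedure. The function name `word_error_rate`, the lowercasing and whitespace tokenization, and the sample sentences are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count.

    Assumption: case-insensitive comparison on whitespace-separated tokens;
    punctuation handling is left out of this sketch.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution ("opened" -> "open") and one
# deletion ("a") against an eight-word reference.
reference = "the dura was opened in a cruciate fashion"
hypothesis = "the dura was open in cruciate fashion"
print(f"WER = {word_error_rate(reference, hypothesis):.2%}")
```

Under this definition, a reported mean WER of 1.75% corresponds to roughly 1.75 word-level errors per 100 dictated reference words.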

Topics

Artificial Intelligence in Healthcare and Education; Medical Imaging and Analysis
