This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

322 Distributed Vector Representations of Neurosurgical Admission Notes Predict Discharge Disposition

2023 · 0 citations · Neurosurgery



Abstract

INTRODUCTION: Natural language processing (NLP) aims to extract information from unstructured language input for use in machine learning, without the need for laborious manual review of text to retrieve data of potential interest. An NLP-based model utilizing term frequency-inverse document frequency (TF-IDF) vectoral representations of pre-operative clinical notes has previously been shown to predict discharge disposition following meningioma resection. The TF-IDF vectorization scheme is based on word counts across documents but does not account for word order or semantic information. An alternative algorithm, paragraph vectorization, utilizes an unsupervised neural network to learn document vector representations.

METHODS: We analyzed 7,122 admission notes. We trained a neural network to derive paragraph vector representations of notes and trained a classifier to predict home vs. non-home discharge disposition based on these representations. Out-of-sample performance was assessed on a held-out testing set. We repeated the analysis using TF-IDF document representations to compare the two approaches. Performance was quantified using area under the receiver operating characteristic curve (ROC-AUC).

RESULTS: Model performance increased with vector dimensionality. A 200-dimensional distributed document vectorization was highly predictive of non-home discharge, with an ROC-AUC of 0.82 ± 0.01 (mean ± standard error over 5-fold nested cross-validation). When the dimensionality of the embedding vector space was relatively low (e.g., <100), learned paragraph vectors predicted discharge disposition better than TF-IDF vectors.

CONCLUSIONS: Natural language processing of admission notes via neural network-learned distributed vectoral representations is a viable strategy for information extraction from neurosurgical clinical documentation. This embedding scheme overcomes some of the limitations of TF-IDF and encodes more information in lower-dimensional embedding vector spaces. It holds promise as a method of using free text to train machine learning algorithms for prediction of clinically relevant outcomes.
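The abstract's remark that TF-IDF "does not account for word order" can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy notes are invented, the tokenizer is a plain whitespace split, and the smoothed IDF formula is one common variant rather than the one the authors necessarily used.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with weights tf * idf.

    tf  = raw count of the term in the document
    idf = ln((1 + N) / (1 + df)) + 1  (a common smoothed variant; assumed here)
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[t] * idf[t] for t in vocab])
    return vocab, vectors


# Hypothetical toy notes, invented for illustration only.
notes = [
    "patient denies headache reports weakness",
    "patient reports headache denies weakness",
    "patient ambulates independently at home",
]
vocab, vectors = tfidf_vectors(notes)

# The first two notes contain the same words in a different order, with
# opposite clinical meaning, yet they receive identical TF-IDF vectors.
same_vector = vectors[0] == vectors[1]
print(same_vector)  # True
```

The first two toy notes differ clinically but are indistinguishable to a bag-of-words weighting, which is the limitation that learned paragraph vectors are intended to address.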

Topics

Artificial Intelligence in Healthcare and Education
