This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
322 Distributed Vector Representations of Neurosurgical Admission Notes Predict Discharge Disposition
Citations: 0
Authors: 2
Year: 2023
Abstract
INTRODUCTION: Natural language processing (NLP) aims to extract information from unstructured language input for use in machine learning, without the need for laborious manual review of text to retrieve data of potential interest. An NLP-based model using term frequency-inverse document frequency (TF-IDF) vector representations of pre-operative clinical notes has previously been shown to predict discharge disposition following meningioma resection. The TF-IDF vectorization scheme is based on word counts across documents but does not account for word order or semantic information. An alternative algorithm, paragraph vectorization, uses an unsupervised neural network to learn document vector representations.

METHODS: We analyzed 7,122 admission notes. We trained a neural network to derive paragraph vector representations of notes and trained a classifier to predict home vs non-home discharge disposition based on these representations. Out-of-sample performance was assessed on a held-out testing set. We repeated the analysis using TF-IDF document representations to compare the two approaches. Performance was quantified using area under the receiver operating characteristic curve (ROC-AUC).

RESULTS: Model performance increased with vector dimensionality. A 200-dimensional distributed document vectorization was highly predictive of non-home discharge, with an ROC-AUC of 0.82 ± 0.01 (mean ± standard error over 5-fold nested cross-validation). When the dimensionality of the embedding vector space was relatively low (e.g., fewer than 100 dimensions), learned paragraph vectors predicted discharge disposition better than TF-IDF vectors.

CONCLUSIONS: Natural language processing of admission notes via neural network-learned distributed vector representations is a viable strategy for extracting information from neurosurgical clinical documentation. This embedding scheme overcomes some of the limitations of TF-IDF and encodes more information in lower-dimensional embedding vector spaces. It holds promise as a method of using free text to train machine learning algorithms for prediction of clinically relevant outcomes.
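As an illustration of two ideas the abstract relies on, the following minimal Python sketch shows (a) that a TF-IDF representation is blind to word order and (b) how ROC-AUC can be computed from labels and scores. It does not reproduce the study's pipeline and does not implement paragraph vectors. The function names `tfidf_vectors` and `roc_auc`, the smoothed IDF variant, and the toy notes and scores are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for whitespace-tokenized documents."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        length = max(len(toks), 1)  # avoid division by zero for empty notes
        # Smoothed IDF is one common variant; it is an assumption here.
        vec = [
            (counts[t] / length) * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
            for t in vocab
        ]
        vectors.append(vec)
    return vocab, vectors


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy notes (not taken from the study). The first two contain
# the same words in a different order and mean different things clinically.
notes = [
    "patient denies weakness reports headache",
    "patient reports weakness denies headache",
    "patient lives alone requires assistance",
]
vocab, vecs = tfidf_vectors(notes)
same_vector = vecs[0] == vecs[1]  # word order is invisible to TF-IDF

# Hypothetical labels (1 = non-home discharge) and classifier scores.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6])
```

In this toy case the first two notes receive identical TF-IDF vectors even though their meanings differ, which is the limitation the abstract attributes to TF-IDF. A learned document embedding is not constrained in this way, although whether it separates such notes in practice depends on the training data.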