This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Semi-Supervised Natural Language Approach for Fine-Grained Classification of Medical Reports

2019 · 1 citation · arXiv (Cornell University) · Open Access

Abstract

Although machine learning has become a powerful tool to augment doctors in clinical analysis, the immense amount of labeled data that is necessary to train supervised learning approaches burdens each development task as time and resource intensive. The vast majority of dense clinical information is stored in written reports, detailing pertinent patient information. The challenge with utilizing natural language data for standard model development is due to the complex nature of the modality. In this research, a model pipeline was developed to utilize an unsupervised approach to train an encoder-language model, a recurrent network, to generate document encodings; which then can be used as features passed into a decoder-classifier model that requires magnitudes less labeled data than previous approaches to differentiate between fine-grained disease classes accurately. The language model was trained on unlabeled radiology reports from the Massachusetts General Hospital Radiology Department (n=218,159) and terminated with a loss of 1.62. The classification models were trained on three labeled datasets of head CT studies of reported patients, presenting large vessel occlusion (n=1403), acute ischemic strokes (n=331), and intracranial hemorrhage (n=4350), to identify a variety of different findings directly from the radiology report data; resulting in AUCs of 0.98, 0.95, and 0.99, respectively, for the large vessel occlusion, acute ischemic stroke, and intracranial hemorrhage datasets. The output encodings are able to be used in conjunction with imaging data, to create models that can process a multitude of different modalities. The ability to automatically extract relevant features from textual data allows for faster model development and integration of textual modality, overall, allowing clinical reports to become a more viable input for more encompassing and accurate deep learning models.
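
The two-stage idea in the abstract (an encoder that turns a report into a fixed-length encoding, followed by a classifier trained on a small labeled set and scored by AUC) could be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the encoder is a tiny recurrent network with fixed random weights standing in for the language model trained on unlabeled reports, the classifier is plain logistic regression, and the four report strings, the `HIDDEN` size, and all function names are hypothetical.

```python
import math
import random

HIDDEN = 8  # hypothetical encoding size, not taken from the paper


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_encoder(vocab, hidden=HIDDEN, seed=0):
    """Toy recurrent encoder with fixed random weights.

    Assumption: stands in for the trained language model; nothing is learned here.
    """
    rng = random.Random(seed)
    embed = {w: [rng.uniform(-1, 1) for _ in range(hidden)] for w in vocab}
    w_hh = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(hidden)]

    def encode(text):
        h = [0.0] * hidden
        for token in text.lower().split():
            x = embed.get(token, [0.0] * hidden)  # unknown tokens map to zeros
            h = [math.tanh(x[i] + sum(w_hh[i][j] * h[j] for j in range(hidden)))
                 for i in range(hidden)]
        return h  # the final hidden state serves as the document encoding

    return encode


def train_classifier(features, labels, epochs=500, lr=0.5):
    """Logistic regression on the encodings, fitted by per-example gradient steps."""
    w, b = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            g = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return lambda x: sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def auc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy reports and labels, for illustration only (1 = hemorrhage reported).
reports = [
    "acute hemorrhage in left frontal lobe",
    "no acute intracranial abnormality",
    "large subdural hemorrhage with midline shift",
    "normal head ct no hemorrhage",
]
labels = [1, 0, 1, 0]

vocab = sorted({t for r in reports for t in r.split()})  # sorted for determinism
encode = make_encoder(vocab)
features = [encode(r) for r in reports]
predict = train_classifier(features, labels)
scores = [predict(f) for f in features]
print("training AUC on toy data:", auc(scores, labels))
```

In a setup like the one the abstract describes, the encoder would first be trained on the unlabeled reports and then reused unchanged, so only the small classifier needs labeled examples. The AUC here is computed on the training examples themselves, so it says nothing about generalization.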

Topics

Radiomics and Machine Learning in Medical Imaging · Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare
