OpenAlex · Aktualisierung stündlich · Letzte Aktualisierung: 20.05.2026, 22:20

Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Exploring Transformer Text Generation for Medical Dataset Augmentation.

2020·39 Zitationen·Research Portal (King's College London)
Volltext beim Verlag öffnen

39

Zitationen

3

Autoren

2020

Jahr

Abstract

Natural Language Processing (NLP) can help unlock the vast troves of unstructured data in clinical text and thus improve healthcare research. However, a big barrier to developments in this field is data access due to patient confidentiality which prohibits the sharing of this data, resulting in small, fragmented and sequestered openly available datasets. Since NLP model development requires large quantities of data, we aim to help side-step this roadblock by exploring the usage of Natural Language Generation in augmenting datasets such that they can be used for NLP model development on downstream clinically relevant tasks. We propose a methodology guiding the generation with structured patient information in a sequence-to-sequence manner. We experiment with state-of-the-art Transformer models and demonstrate that our augmented dataset is capable of beating our baselines on a downstream classification task. Finally, we also create a user interface and release the scripts to train generation models to stimulate further research in this area.

Ähnliche Arbeiten

Autoren

Institutionen

Themen

Topic ModelingMachine Learning in HealthcareBiomedical Text Mining and Ontologies
Volltext beim Verlag öffnen