This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Modeling disagreement in automatic data labeling for semi-supervised learning in Clinical Natural Language Processing

2024 · 2 citations · Frontiers in Artificial Intelligence · Open Access

Abstract

Introduction: Computational models providing accurate estimates of their uncertainty are crucial for risk management associated with decision-making in healthcare contexts. This is especially true since many state-of-the-art systems are trained using the data which have been labeled automatically (self-supervised mode) and tend to overfit.

Methods: In this study, we investigate the quality of uncertainty estimates from a range of current state-of-the-art predictive models applied to the problem of observation detection in radiology reports. This problem remains understudied for Natural Language Processing in the healthcare domain.

Results: We demonstrate that Gaussian Processes (GPs) provide superior performance in quantifying the risks of three uncertainty labels based on the negative log predictive probability (NLPP) evaluation metric and mean maximum predicted confidence levels (MMPCL), whilst retaining strong predictive performance.

Discussion: Our conclusions highlight the utility of probabilistic models applied to "noisy" labels and that similar methods could provide utility for Natural Language Processing (NLP) based automated labeling tasks.
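
To illustrate the two evaluation metrics named in the Results, here is a minimal Python sketch. The abstract does not define them, so the sketch assumes that NLPP is the mean negative log of the probability a model assigns to the true label, and that MMPCL is the mean of each sample's highest predicted probability. The function names, the toy probabilities and the three-class setup are hypothetical and are not taken from the paper.

```python
import math


def nlpp(probs, labels, eps=1e-12):
    """Mean negative log probability assigned to the true class.

    Assumed definition (not confirmed by the abstract).
    probs:  list of per-sample class-probability lists
    labels: list of true class indices
    """
    total = 0.0
    for row, label in zip(probs, labels):
        # Clip to avoid log(0) when a model assigns zero probability.
        p = min(max(row[label], eps), 1.0)
        total += -math.log(p)
    return total / len(labels)


def mean_max_confidence(probs):
    """Mean of the highest predicted probability per sample.

    Assumed reading of MMPCL (not confirmed by the abstract).
    """
    return sum(max(row) for row in probs) / len(probs)


# Hypothetical predictions for three samples over three classes.
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]
labels = [0, 1, 2]

nlpp_value = nlpp(probs, labels)
mmpcl_value = mean_max_confidence(probs)
print(f"NLPP: {nlpp_value:.4f}, mean max confidence: {mmpcl_value:.4f}")
```

Under these assumed definitions, a lower NLPP means the model puts more probability on the correct labels. A mean maximum confidence close to 1 combined with a high NLPP would suggest overconfident predictions.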

Topics

Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI)
