This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
On the Calibration of Pre-trained Language Models using Mixup Guided by Area Under the Margin and Saliency
Citations: 0
Authors: 2
Year: 2022
Abstract
A well-calibrated neural model produces confidence estimates (probability outputs) that closely approximate the expected accuracy. While prior studies have shown that mixup training as a data augmentation technique can improve model calibration on image classification tasks, little is known about using mixup for model calibration on natural language understanding (NLU) tasks. In this paper, we explore mixup for model calibration on several NLU tasks and propose a novel mixup strategy for pre-trained language models that improves model calibration further. Our proposed mixup is guided by both the Area Under the Margin (AUM) statistic and the saliency of each sample. Moreover, we combine our mixup strategy with model miscalibration correction techniques (i.e., label smoothing and temperature scaling) and provide detailed analyses of their impact on our proposed mixup. We focus on systematically designing experiments on three NLU tasks: natural language inference, paraphrase detection, and commonsense reasoning. Our method achieves the lowest expected calibration error compared to strong baselines on both in-domain and out-of-domain test samples while maintaining competitive accuracy.
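For reference, expected calibration error (ECE), the metric reported in the abstract, measures the gap between a model's confidence and its actual accuracy across confidence bins. Below is a minimal sketch of the standard binned ECE computation; the function name, the NumPy implementation, and the default of 10 equal-width bins are illustrative assumptions, not the authors' code.

    import numpy as np

    def expected_calibration_error(confidences, correct, n_bins=10):
        # Partition predictions into equal-width confidence bins and
        # accumulate the per-bin |accuracy - mean confidence| gap,
        # weighted by the fraction of samples falling in each bin.
        confidences = np.asarray(confidences, dtype=float)
        correct = np.asarray(correct, dtype=float)
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        ece, n = 0.0, len(confidences)
        for lo, hi in zip(edges[:-1], edges[1:]):
            in_bin = (confidences > lo) & (confidences <= hi)
            if in_bin.any():
                gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
                ece += (in_bin.sum() / n) * gap
        return ece

    # Example: a model that is 90% confident but only 60% accurate is miscalibrated.
    conf = np.array([0.9, 0.9, 0.9, 0.9, 0.9])
    hit = np.array([1, 1, 1, 0, 0])
    print(expected_calibration_error(conf, hit))  # ~0.3

A well-calibrated model drives this quantity toward zero: within each bin, its stated confidence matches its empirical accuracy.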
Similar Works
MizAR 60 for Mizar 50
2023 · 74,517 citations
ImageNet: A large-scale hierarchical image database
2009 · 60,624 citations
Microsoft COCO: Common Objects in Context
2014 · 41,271 citations
Fully convolutional networks for semantic segmentation
2015 · 36,380 citations
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
2017 · 20,463 citations