This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
On the Calibration of Pre-trained Language Models using Mixup Guided by Area Under the Margin and Saliency
Citations: 0
Authors: 2
Year: 2022
Abstract
A well-calibrated neural model produces confidence estimates (probability outputs) that closely approximate the expected accuracy. While prior studies have shown that mixup training as a data augmentation technique can improve model calibration on image classification tasks, little is known about using mixup for model calibration on natural language understanding (NLU) tasks. In this paper, we explore mixup for model calibration on several NLU tasks and propose a novel mixup strategy for pre-trained language models that improves model calibration further. Our proposed mixup is guided by both the Area Under the Margin (AUM) statistic and the saliency of each sample. Moreover, we combine our mixup strategy with model miscalibration correction techniques (i.e., label smoothing and temperature scaling) and provide detailed analyses of their impact on our proposed mixup. We focus on systematically designing experiments on three NLU tasks: natural language inference, paraphrase detection, and commonsense reasoning. Our method achieves the lowest expected calibration error compared to strong baselines on both in-domain and out-of-domain test samples while maintaining competitive accuracy.
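For reference, expected calibration error (ECE), the metric reported in the abstract, measures the gap between a model's confidence and its actual accuracy across confidence bins. Below is a minimal sketch of the standard binned ECE computation; the function name, the NumPy implementation, and the default of 10 equal-width bins are illustrative assumptions, not the authors' code.

    import numpy as np

    def expected_calibration_error(confidences, correct, n_bins=10):
        # Partition predictions into equal-width confidence bins and
        # accumulate the per-bin |accuracy - mean confidence| gap,
        # weighted by the fraction of samples falling in each bin.
        confidences = np.asarray(confidences, dtype=float)
        correct = np.asarray(correct, dtype=float)
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        ece, n = 0.0, len(confidences)
        for lo, hi in zip(edges[:-1], edges[1:]):
            in_bin = (confidences > lo) & (confidences <= hi)
            if in_bin.any():
                gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
                ece += (in_bin.sum() / n) * gap
        return ece

    # Example: a model that is 90% confident but only 60% accurate is miscalibrated.
    conf = np.array([0.9, 0.9, 0.9, 0.9, 0.9])
    hit = np.array([1, 1, 1, 0, 0])
    print(expected_calibration_error(conf, hit))  # ~0.3

A well-calibrated model drives this quantity toward zero: within each bin, its stated confidence matches its empirical accuracy.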
Similar Works
MizAR 60 for Mizar 50
2023 · 74,517 citations
ImageNet: A large-scale hierarchical image database
2009 · 60,624 citations
Microsoft COCO: Common Objects in Context
2014 · 41,271 citations
Fully convolutional networks for semantic segmentation
2015 · 36,380 citations
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
2017 · 20,463 citations