OpenAlex · Updated hourly · Last updated: 16.05.2026, 16:33

This is an overview page with metadata for this scientific work. The full article is available from the publisher.

Interpretable Neural Predictions with Differentiable Binary Variables

2019 · 31 citations · Open Access
Open full text at the publisher

31

Citations

3

Authors

2019

Year

Abstract

The success of neural networks comes hand in hand with a desire for more interpretability. We focus on text classifiers and make them more interpretable by having them provide a justification (a rationale) for their predictions. We approach this problem by jointly training two neural network models: a latent model that selects a rationale (i.e. a short and informative part of the input text), and a classifier that learns from the words in the rationale alone. Previous work proposed to assign binary latent masks to input positions and to promote short selections via sparsity-inducing penalties such as L0 regularisation. We propose a latent model that mixes discrete and continuous behaviour, allowing at the same time for binary selections and gradient-based training without REINFORCE. In our formulation, we can tractably compute the expected value of penalties such as L0, which allows us to directly optimise the model towards a prespecified text selection rate. We show that our approach is competitive with previous work on rationale extraction, and explore further uses in attention mechanisms.
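The abstract describes a latent variable that is continuous enough for gradient-based training yet places point mass on the exact values 0 and 1, with a tractable expected L0 penalty. As a minimal sketch of how such a mixed discrete-continuous variable can be built, the snippet below stretches a Kumaraswamy sample beyond (0, 1) and clips it back (the "stretch and rectify" idea; the specific parameterisation and values here are illustrative, not the paper's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(0)

def kuma_cdf(x, a, b):
    """CDF of the Kumaraswamy(a, b) distribution on (0, 1)."""
    return 1.0 - (1.0 - x ** a) ** b

def hard_kuma_sample(a, b, l=-0.1, r=1.1, size=None):
    """Sample a stretched-and-rectified Kumaraswamy variable.

    A Kumaraswamy sample on (0, 1) is stretched to (l, r) and then
    clipped back to [0, 1], so the result has point masses at exactly
    0 and 1 plus a continuous density in between.
    """
    u = rng.uniform(size=size)
    k = (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)  # inverse-CDF sampling
    return np.clip(l + (r - l) * k, 0.0, 1.0)

def expected_l0(a, b, l=-0.1, r=1.1):
    """P(z != 0): the tractable expected L0 cost of one gate.

    z == 0 exactly when the stretched sample falls below 0, i.e. when
    the underlying Kumaraswamy sample is below -l / (r - l).
    """
    return 1.0 - kuma_cdf(-l / (r - l), a, b)
```

Summing `expected_l0` over all input positions yields a penalty that is differentiable in the distribution parameters, which is what makes it possible to optimise directly towards a prespecified selection rate.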

Similar works

Authors

Institutions

Topics

Topic Modeling · Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare