Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Machine Learning Explainability and Robustness

2021·15 Zitationen

Volltext beim Verlag öffnen

Zitationen

Autoren

2021

Jahr

Abstract

This tutorial examines the synergistic relationship between explainability methods for machine learning and a significant problem related to model quality: robustness against adversarial perturbations. We begin with a broad overview of approaches to explainable AI, before narrowing our focus to post-hoc explanation methods for predictive models. We discuss perspectives on what constitutes a "good'' explanation in various settings, with an emphasis on axiomatic justifications for various explanation methods. In doing so, we will highlight the importance of an explanation method's faithfulness to the target model, as this property allows one to distinguish between explanations that are unintelligible because of the method used to produce them, and cases where a seemingly poor explanation points to model quality issues. Next, we introduce concepts surrounding adversarial robustness, including adversarial attacks as well as a range of corresponding state-of-the-art defenses. Finally, building on the knowledge presented thus far, we present key insights from the recent literature on the connections between explainability and robustness, showing that many commonly-perceived explainability issues may be caused by non-robust model behavior. Accordingly, a careful study of adversarial examples and robustness can lead to models whose explanations better appeal to human intuition and domain knowledge.

Autoren

Institutionen

Themen

Adversarial Robustness in Machine LearningExplainable Artificial Intelligence (XAI)Artificial Intelligence in Healthcare and Education

Volltext beim Verlag öffnen

Machine Learning Explainability and Robustness

Abstract

Ähnliche Arbeiten

Autoren

Institutionen

Themen