Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Machine Learning Explainability and Robustness
15
Zitationen
6
Autoren
2021
Jahr
Abstract
This tutorial examines the synergistic relationship between explainability methods for machine learning and a significant problem related to model quality: robustness against adversarial perturbations. We begin with a broad overview of approaches to explainable AI, before narrowing our focus to post-hoc explanation methods for predictive models. We discuss perspectives on what constitutes a "good'' explanation in various settings, with an emphasis on axiomatic justifications for various explanation methods. In doing so, we will highlight the importance of an explanation method's faithfulness to the target model, as this property allows one to distinguish between explanations that are unintelligible because of the method used to produce them, and cases where a seemingly poor explanation points to model quality issues. Next, we introduce concepts surrounding adversarial robustness, including adversarial attacks as well as a range of corresponding state-of-the-art defenses. Finally, building on the knowledge presented thus far, we present key insights from the recent literature on the connections between explainability and robustness, showing that many commonly-perceived explainability issues may be caused by non-robust model behavior. Accordingly, a careful study of adversarial examples and robustness can lead to models whose explanations better appeal to human intuition and domain knowledge.
Ähnliche Arbeiten
Rethinking the Inception Architecture for Computer Vision
2016 · 30.416 Zit.
MobileNetV2: Inverted Residuals and Linear Bottlenecks
2018 · 24.552 Zit.
CBAM: Convolutional Block Attention Module
2018 · 21.448 Zit.
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
2020 · 21.347 Zit.
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
2015 · 18.535 Zit.